Beefy Boxes and Bandwidth Generously Provided by pair Networks
Just another Perl shrine
 
PerlMonks  

Re: how are the number of buckets in a perl hash chosen

by oiskuu (Friar)
on Jun 09, 2014 at 08:42 UTC ( #1089241=note: print w/ replies, xml ) Need Help??


in reply to how are the number of buckets in a perl hash chosen

Choosing a good hash function is a bit of a black magic. Modulo prime might yield a better bucket distribution in simple cases with simple hash functions, e.g. when the keys fall into regular label+number pattern.

In practice, the modulo operation (by a variable) is often insufferably slow compared to arithmetic shift or masking. Therefore, power-of-two bucket sizes are preferred, together with a robust hash function. Strong (and randomized) hash function is a necessity nowadays. Does the course mention security aspects of generic hash implementations, such as resistance to DoS attacks?


Comment on Re: how are the number of buckets in a perl hash chosen
Re^2: how are the number of buckets in a perl hash chosen
by david2008 (Scribe) on Jun 09, 2014 at 09:00 UTC
    It did speak generally about security without going into the gory details. We learned about universal hashing and why it is important.
    It was not mentioned that the modulo operation is slow. The fact that the perl hash function manages to spread the entries evenly into buckets, despite the fact that the number of buckets is even is surprising.
    I guess that i will have to read the source code to understand exactly how this happens.
    This is also what Tim Roughgarden suggests. He says that if we want to understand a data structure well, we should either read a text book about the issue, or an open source implementation.

Log In?
Username:
Password:

What's my password?
Create A New User
Node Status?
node history
Node Type: note [id://1089241]
help
Chatterbox?
and the web crawler heard nothing...

How do I use this? | Other CB clients
Other Users?
Others about the Monastery: (13)
As of 2014-12-19 10:09 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    Is guessing a good strategy for surviving in the IT business?





    Results (77 votes), past polls