Beefy Boxes and Bandwidth Generously Provided by pair Networks
good chemistry is complicated,
and a little bit messy -LW
 
PerlMonks  

Re^2: how are the number of buckets in a perl hash chosen

by david2008 (Scribe)
on Jun 09, 2014 at 08:31 UTC ( #1089239=note: print w/ replies, xml ) Need Help??


in reply to Re: how are the number of buckets in a perl hash chosen
in thread how are the number of buckets in a perl hash chosen

If i did understand exactly, from your example, that the power of 2 should be a prime.
I tried it on different examples and saw that this is not correct every time.

perl -E "%c = (3=>5);keys(%c) = 33;say scalar(%c)" 1/64
Maybe i did not understand it 100%.
I looked at perldata and keys and saw indeed that the number of buckets is a power of 2 but it was not explained why.


Comment on Re^2: how are the number of buckets in a perl hash chosen
Download Code
Re^3: how are the number of buckets in a perl hash chosen
by Anonymous Monk on Jun 09, 2014 at 15:17 UTC

    If i did understand exactly, from your example, that the power of 2 should be a prime.

    I don't know, wolfram says "prime factorization"...

    I looked at perldata and keys and saw indeed that the number of buckets is a power of 2 but it was not explained why.

    Oh, why? because :) its coded that way :D

    newsize &= ~(newsize & (1 + ~newsize)); /* get proper power of 2 */ .. Newxz(a, PERL_HV_ARRAY_ALLOC_BYTES(newsize), char);

    Programmers just love to do things in powers of two

    perl hash presize why power of 2 How Hashes Really Work - Perl.com

    In principle, a hashing function returns an array index directly; in practice, it is common to use its (arbitrary) return value modulo the number of buckets as the actual index. (Using a prime number of buckets that is not too close to a power of two tends to produce a sufficiently uniform key distribution.)
    A Hash Function for Hash Table Lookup
    The best hash table sizes are powers of 2. There is no need to do mod a prime (mod is sooo slow!).
    Hash table
    In the case that the array size is a power of two, the remainder operation is reduced to masking, which improves speed, but can increase problems with a poor hash function.

    So there you have it, power of two, its for the speed

Log In?
Username:
Password:

What's my password?
Create A New User
Node Status?
node history
Node Type: note [id://1089239]
help
Chatterbox?
and the web crawler heard nothing...

How do I use this? | Other CB clients
Other Users?
Others rifling through the Monastery: (5)
As of 2014-10-02 04:29 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    What is your favourite meta-syntactic variable name?














    Results (48 votes), past polls