laziness, impatience, and hubris PerlMonks

### Re^2: how are the number of buckets in a perl hash chosen

by david2008 (Scribe)
 on Jun 09, 2014 at 08:31 UTC ( #1089239=note: print w/replies, xml ) Need Help??

If i did understand exactly, from your example, that the power of 2 should be a prime.
I tried it on different examples and saw that this is not correct every time.
```perl  -E "%c = (3=>5);keys(%c) = 33;say  scalar(%c)"
1/64
Maybe i did not understand it 100%.
I looked at perldata and keys and saw indeed that the number of buckets is a power of 2 but it was not explained why.

Replies are listed 'Best First'.
Re^3: how are the number of buckets in a perl hash chosen
by Anonymous Monk on Jun 09, 2014 at 15:17 UTC

If i did understand exactly, from your example, that the power of 2 should be a prime.

I don't know, wolfram says "prime factorization"...

I looked at perldata and keys and saw indeed that the number of buckets is a power of 2 but it was not explained why.

Oh, why? because :) its coded that way :D

```newsize &= ~(newsize & (1 + ~newsize));    /* get proper power of 2 */
..
Newxz(a, PERL_HV_ARRAY_ALLOC_BYTES(newsize), char);

Programmers just love to do things in powers of two

In principle, a hashing function returns an array index directly; in practice, it is common to use its (arbitrary) return value modulo the number of buckets as the actual index. (Using a prime number of buckets that is not too close to a power of two tends to produce a sufficiently uniform key distribution.)
A Hash Function for Hash Table Lookup
The best hash table sizes are powers of 2. There is no need to do mod a prime (mod is sooo slow!).
Hash table
In the case that the array size is a power of two, the remainder operation is reduced to masking, which improves speed, but can increase problems with a poor hash function.

So there you have it, power of two, its for the speed

Create A New User
Node Status?
node history
Node Type: note [id://1089239]
help
Chatterbox?
 [Corion]: marioroy: Oh, that's always cool, having API-compatible modules. This makes testing and comparing things much easier [marioroy]: IPC in MCE::Shared can handle 400k (sends) per second. That's seems a lot for being a pure-Perl module. After making the release, will come back and post a solution for a node by a fellow wanting faster logging. [Corion]: While working on WWW::Mechanize:: Chrome, I had the suspicion that AnyEvent was doing something wrong, but I was able to swap it out for Mojolicious and the error persisted. [Corion]: Of course, the error was in my own code ;) [marioroy]: Corion, start and start_child in MCE::Hobo::Manager return a MCE::Hobo object, whereas P::FM returns the PID. I can have it return the PID though. I tried Hobo::Manager with several P::FM modules, just changed P::FM to MCE::Hobo::Manager and it works. [marioroy]: I also have a Hobo driver for Forklift allowing folks to use in multiple classes, no conflicts with one another. That's not possible for P::FM. [Discipulus]: congrats marioroy! [marioroy]: CORE::wait works well eventhough multiple instances or classes using Hobo::Manager.

How do I use this? | Other CB clients
Other Users?
Others pondering the Monastery: (5)
As of 2017-05-26 08:38 GMT
Sections?
Information?
Find Nodes?
Leftovers?
Voting Booth?
My favorite model of computation is ...

Results (189 votes). Check out past polls.