Re: When does Perl double the number of buckets in hash?

by BrowserUk (Pope)
on Nov 30, 2011 at 22:37 UTC


in reply to When does Perl double the number of buckets in hash?

Remember that keys != buckets. Sometimes, and more frequently than you might imagine, especially with small hashes, two or more keys will hash to the same bucket.

That said, from a cursory inspection, the doubling seems to happen when the number of buckets in use exceeds 2/3rds of the buckets available:

++$h{$_ } and print scalar keys %h, ' : ', scalar %h for 'aaaa' .. 'zzzz';;
1 : 1/8
2 : 2/8
3 : 2/8 ## hash collision
4 : 2/8 ## hash collision
5 : 3/8
6 : 4/8
7 : 5/8 ## 5/8ths
8 : 6/16
9 : 6/16
10 : 7/16
11 : 7/16 ## hash collision
12 : 8/16
13 : 9/16
14 : 9/16 ## hash collision
15 : 9/16 ## hash collision
16 : 10/16 ## 5/8ths
17 : 15/32
18 : 15/32 ## hash collision
19 : 16/32
20 : 16/32 ## hash collision
21 : 16/32 ## hash collision
22 : 17/32
23 : 17/32 ## hash collision
24 : 18/32
25 : 19/32
26 : 19/32 ## hash collision
27 : 20/32
28 : 20/32 ## hash collision
29 : 20/32 ## hash collision
30 : 20/32 ## hash collision ## 5/8ths
31 : 21/32
32 : 25/64
...
60 : 40/64
61 : 40/64 ## hash collision
62 : 40/64 ## hash collision 5/8ths
63 : 41/64 ## 2/3rds
64 : 51/128
...
128 : 85/128 ## 2/3rds
129 : 106/256
130 : 107/256
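For anyone wanting to repeat the experiment on a newer perl, here is a minimal sketch. It assumes that on Perl 5.26 and later `scalar(%h)` returns the key count rather than the used/total string, and that `Hash::Util::bucket_ratio` is available there to give the old-style string. The helper name `bucket_usage` is made up for this example. Expect the exact numbers to differ from the transcript above, since the hash function and seeding vary between perl versions and runs.

```perl
use strict;
use warnings;
use Hash::Util ();

# Hypothetical helper: return the "used/total" bucket string for a hash ref.
# Uses Hash::Util::bucket_ratio when this perl provides it, otherwise falls
# back to scalar(%h), which reported the same string on older perls.
sub bucket_usage {
    my ($h) = @_;
    return Hash::Util::bucket_ratio(%$h) if defined &Hash::Util::bucket_ratio;
    return scalar(%$h);
}

my %h;
my @log;
for my $key ('aaaa' .. 'aaez') {    # 130 distinct keys
    $h{$key}++;
    push @log, scalar(keys %h) . ' : ' . bucket_usage(\%h);
}
print "$_\n" for @log;
```

Each output line has the same "keys : used/total" shape as the transcript, so the row where the total doubles can be compared directly against the key count.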

With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
"Science is about questioning the status quo. Questioning authority".
In the absence of evidence, opinion is indistinguishable from prejudice.

The start of some sanity?


Re^2: When does Perl double the number of buckets in hash?
by ikegami (Pope) on Dec 01, 2011 at 00:36 UTC

    As shown by the code the OP posted, it actually happens when the number of elements (including the newly inserted element) is equal to the number of buckets. What the OP missed is that it only happens if there's a collision.

    Using your numbers:

    1 : 1/8
    2 : 2/8
    3 : 2/8 ## hash collision
    4 : 2/8 ## hash collision
    5 : 3/8
    6 : 4/8
    7 : 5/8
    8 : 6/16 ## 8 == 8 => split
    9 : 6/16
    10 : 7/16
    11 : 7/16 ## hash collision
    12 : 8/16
    13 : 9/16
    14 : 9/16 ## hash collision
    15 : 9/16 ## hash collision
    16 : 10/16 ## 16 == 16 => split
    17 : 15/32
    18 : 15/32 ## hash collision
    19 : 16/32
    20 : 16/32 ## hash collision
    21 : 16/32 ## hash collision
    22 : 17/32
    23 : 17/32 ## hash collision
    24 : 18/32
    25 : 19/32
    26 : 19/32 ## hash collision
    27 : 20/32
    28 : 20/32 ## hash collision
    29 : 20/32 ## hash collision
    30 : 20/32 ## hash collision
    31 : 21/32
    32 : 25/64 ## 32 == 32 => split
    ...
    60 : 40/64
    61 : 40/64 ## hash collision
    62 : 40/64 ## hash collision 5/8ths
    63 : 41/64
    64 : 51/128 ## 64 == 64 => split
    ...
    128 : 85/128
    129 : 106/256 ## 128 == 128 => split
    130 : 107/256

    There's a second condition that causes a split: A degenerate hash is detected. A degenerate hash is one that has a bucket with so many elements as to make it slow to find keys in that bucket. I didn't try to determine the exact condition for when this occurs.
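The first rule described above (double the buckets when a key lands in an occupied bucket and the key count has reached the bucket count) can be sketched as a toy simulation. This is not perl's real implementation: `toy_hash` is a made-up hash function, `simulate` is a hypothetical name, the starting size of 8 is taken from the transcript, and the degenerate-hash condition is left out because its exact trigger was not determined.

```perl
use strict;
use warnings;

# Toy string hash, an assumption for illustration only (NOT perl's internal one).
sub toy_hash {
    my ($key) = @_;
    my $h = 5381;
    $h = (($h * 33) + ord($_)) & 0xFFFFFFFF for split //, $key;
    return $h;
}

# Insert distinct keys into a toy chained table. The bucket array doubles when
# a key lands in an already occupied bucket AND the key count (including the
# new key) has reached the current number of buckets.
sub simulate {
    my @keys    = @_;
    my $buckets = 8;                              # starting size from the transcript
    my @table   = map { [] } 1 .. $buckets;
    my ($count, @log) = (0);
    for my $key (@keys) {
        my $slot      = $table[ toy_hash($key) & ($buckets - 1) ];
        my $collision = @$slot > 0;
        push @$slot, $key;
        $count++;
        if ($collision && $count >= $buckets) {   # the split condition
            $buckets *= 2;
            my @new = map { [] } 1 .. $buckets;
            push @{ $new[ toy_hash($_) & ($buckets - 1) ] }, $_
                for map { @$_ } @table;           # redistribute every key
            @table = @new;
        }
        my $used = grep { @$_ } @table;           # buckets holding at least one key
        push @log, [ $count, $used, $buckets ];
    }
    return @log;
}

my @log = simulate('aaaa' .. 'aaez');             # 130 distinct keys
printf "%d : %d/%d\n", @$_ for @log;
```

Because the toy hash function differs from perl's, the used-bucket column will not match the transcript row for row; what the sketch shows is that the total only doubles on a row where a collision coincides with the key count having reached the bucket count.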

      Indeed. Thank you for the correction. I guess I was concentrating on the wrong column.


