
Re: When does Perl double the number of buckets in hash?

by Anonymous Monk
on Dec 02, 2011 at 06:18 UTC


in reply to When does Perl double the number of buckets in hash?

For the following code:

    %hash=(1,2,3,4,5,6,7,8,9,10,11,12,15,16,22,34,88,99);
    $val=%hash;
    print $val;

the output is: 7/8. Can someone explain?
Re^2: When does Perl double the number of buckets in hash?
by BrowserUk (Patriarch) on Dec 02, 2011 at 06:48 UTC
    the output is :7/8 can some-one Explain???

    No! It is easy to see that keys 11 and 15 each hash to the same bucket as one of the earlier keys, because the number of used buckets does not increase when they are added (rows marked *):

        $hash{ $_->[0] } = $_->[1] and print "@$_\t", scalar keys %hash, scalar %hash
            for [1,2],[3,4],[5,6],[7,8],[9,10],[11,12],[15,16],[22,34],[88,99];;

        1 2     1   1/8
        3 4     2   2/8
        5 6     3   3/8
        7 8     4   4/8
        9 10    5   5/8
        11 12   6   5/8  *
        15 16   7   5/8  *
        22 34   8   6/8
        88 99   9   7/8  ???
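    For anyone repeating the experiment on a newer perl, here is a hedged sketch. It assumes Perl 5.26 or later, where (as far as I know) a hash in scalar context returns the key count rather than the used/total string, and the old value is available from Hash::Util::bucket_ratio. The @trace array is just an illustrative name. Hash seeds are randomized on modern perls, so which keys collide will generally differ from the table above.

```perl
use strict;
use warnings;
use Hash::Util qw(bucket_ratio);   # assumption: Perl 5.26 or later

my %hash;
my @trace;   # one row per insertion: key, value, key count, used/total buckets

for my $pair ([1,2],[3,4],[5,6],[7,8],[9,10],[11,12],[15,16],[22,34],[88,99]) {
    my ($key, $value) = @$pair;
    $hash{$key} = $value;
    # bucket_ratio returns the "used/total" string that scalar(%hash) used to give
    push @trace, [ $key, $value, scalar(keys %hash), bucket_ratio(%hash) ];
}

print join("\t", @$_), "\n" for @trace;
```

    A row where the used-bucket count stays the same as in the previous row means the new key landed in an already occupied bucket.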

    But it is not clear why the hash was not doubled in size when the key count equalled the number of buckets. That suggests the rule stated so far is not the complete story either.


    With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    "Science is about questioning the status quo. Questioning authority".
    In the absence of evidence, opinion is indistinguishable from prejudice.

    The start of some sanity?

      In your example, doubling does not occur because there is no collision. As I understand it, the Perl code in hv.c first checks whether there is a collision. If there is, the code compares the total number of keys (including the one just added) with the number of buckets. If the former is greater than or equal to the latter, the number of buckets is doubled. Otherwise, Perl checks the number of keys in that particular bucket. If there are more than HV_MAX_LENGTH_BEFORE_SPLIT (set to 14) keys in that bucket, the number of buckets is also doubled.
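      The decision described above can be sketched in Perl. This is only a model of the rule as I read it, not the actual hv.c code: should_split and its named arguments (collided, keys, buckets, chain_length) are hypothetical names made up for illustration, and only the constant name and its value of 14 come from the description.

```perl
use strict;
use warnings;

# Value taken from the description of hv.c above.
use constant HV_MAX_LENGTH_BEFORE_SPLIT => 14;

# Hypothetical helper: returns 1 if the bucket array would be doubled
# after an insertion, 0 otherwise.
#   collided     - true if the new key landed in a non-empty bucket
#   keys         - total number of keys, including the new one
#   buckets      - current number of buckets
#   chain_length - number of keys in the bucket that received the new key
sub should_split {
    my (%arg) = @_;

    # No collision: nothing further is checked.
    return 0 unless $arg{collided};

    # Collision, and the key count has reached the bucket count.
    return 1 if $arg{keys} >= $arg{buckets};

    # Collision, and this one bucket's chain has grown too long.
    return 1 if $arg{chain_length} > HV_MAX_LENGTH_BEFORE_SPLIT;

    return 0;
}

# The [88,99] insertion from the table: 9 keys, 8 buckets, but no collision.
print should_split(collided => 0, keys => 9, buckets => 8, chain_length => 1), "\n";
# The [15,16] insertion: a collision, but only 7 keys in 8 buckets.
print should_split(collided => 1, keys => 7, buckets => 8, chain_length => 2), "\n";
```

      Both calls print 0 under this model, which matches the table: the bucket count stays at 8 throughout.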

      I still don't understand one thing, though. In the hv.c code we have the comparison (xhv->xhv_keys > (IV)xhv->xhv_max). This seems to suggest that doubling occurs only when the number of keys (including the new one) is greater than the number of buckets. But as I showed in the original post, doubling can occur even when the number of keys equals the number of buckets.

        I still don't understand one thing, though. In hv.c code we have

        I spent some time looking at the sources a while back and I couldn't make sense of them then either. Hence I tend to base my appreciation upon what I can see.

        Beyond having a feel for what happens under the covers there doesn't seem to be much use for knowing exactly how this works, so it's never been a big priority to dot the i's and cross the t's.



        This seems to suggest that doubling occurs only when the number of keys (including the new one) is more than the number of buckets.

        MAX is the highest bucket index (0-based), not the number of buckets. From illguts,

        KEYS is the number of hash elements in the HASH.

        MAX is the number of elements in ARRAY minus one.

        So KEYS >= MAX+1 and KEYS > MAX are true if the number of keys is equal to the number of buckets.
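        A quick check of that equivalence, as a sketch only: keys_exceed_max is a hypothetical helper, and the single assumption it encodes is the illguts definition quoted above (MAX is the number of buckets minus one).

```perl
use strict;
use warnings;

# Hypothetical helper: evaluates the hv.c-style test KEYS > MAX,
# with MAX derived from the bucket count as described in illguts.
sub keys_exceed_max {
    my ($keys, $buckets) = @_;
    my $max = $buckets - 1;          # highest bucket index, 0-based
    return $keys > $max ? 1 : 0;
}

# Compare KEYS > MAX with KEYS >= buckets for a hash with 8 buckets.
for my $keys (6 .. 9) {
    my $ge_buckets = $keys >= 8 ? 1 : 0;
    printf "keys=%d  keys>max=%d  keys>=buckets=%d\n",
        $keys, keys_exceed_max($keys, 8), $ge_buckets;
}
```

        The two columns agree for every key count, and both first become true at keys=8, that is, when the number of keys equals the number of buckets.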

      As you can see, the number of used buckets went up (6⇒7), so no collision occurred and there was no reason for a split.
