Beefy Boxes and Bandwidth Generously Provided by pair Networks
Clear questions and runnable code
get the best and fastest answer
 
PerlMonks  

Re^2: [OT] The statistics of hashing. (accuracy)

by tye (Cardinal)
on Apr 02, 2012 at 21:55 UTC ( #963123=note: print w/ replies, xml ) Need Help??


in reply to Re: [OT] The statistics of hashing.
in thread [OT] The statistics of hashing.

I should review my transcendental function definitions as power series so I can recognize these correlations in future. I knew how to precisely measure the odds but had no clue how to search to see if there were any way to collapse it to some transcendental function (the exact measurement is quite impractical for very large numbers of inserts).

But having 1-exp(-$inserts/$slots) shown to me, I was able to play with that approximation to see how accurate it is. For this application, it is actually an impressive fit.

If 1-exp(-1/$slots) starts out very close to 1/$slots (the correct starting value for the odds of a single-bit collision), then the accuracy never decreases as $inserts grows. For small $slots, 1-exp(-1/$slots) is not very accurate but 1-exp(-$inserts/$slots) slowly becomes more and more accurate as $inserts increases.

For $slots == 2**32, 1-exp(-$inserts/$slots) starts out (at 1==$inserts) so accurate that a 'double' can't even measure the error. So you can consider the calculations based on it "perfect" rather than an approximation. :)

- tye        


Comment on Re^2: [OT] The statistics of hashing. (accuracy)

Log In?
Username:
Password:

What's my password?
Create A New User
Node Status?
node history
Node Type: note [id://963123]
help
Chatterbox?
and the web crawler heard nothing...

How do I use this? | Other CB clients
Other Users?
Others meditating upon the Monastery: (8)
As of 2014-10-20 22:45 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    For retirement, I am banking on:










    Results (92 votes), past polls