Beefy Boxes and Bandwidth Generously Provided by pair Networks
Perl Monk, Perl Meditation

Re: A short meditation about hash search performance

by Abigail-II (Bishop)
on Nov 17, 2003 at 09:34 UTC ( #307611=note: print w/replies, xml ) Need Help??

in reply to Re: Re: A short meditation about hash search performance
in thread A short meditation about hash search performance

Yes, you are right that length of a key plays a role in calculating the hash value. And it plays a role in comparing two keys as well. You can take this time into account, and say insertion/searching takes O (k), where k is the length of input. There's no relationship between k and n though, and usually we aren't interested in this factor. We just define that calculating a hash value can be done in constant time, and so can comparing two keys.

As for the doubling, this factors out (assuming it isn't possible to construct a set of keys such that even after repeated doubling, they keep hashing to the same values). The sketch of the proof is as follows: suppose we rebuild the hash after N inserts; that is, the hash was rebuild when it contains N keys. Building a hash out of N keys will take O (N) time - this is O (1) per key. Now you also have to show that the next rebuild isn't taking place after inserting another c * N keys, for some constant c > 1. This means that if you rebuild a hash with N keys, for at least N / (1 + c) keys, this is the first time they are involved, for at least N / (1 + c)^2 keys, this is the second time they are involved in a rebuild, etc. If you do the math, you will see that there are some keys that have been charged O (log N) on rebuilds, but because there are so many more who have been charged less, it works out to O (1) amortized time. So, yes, a single insert can take O (1) time, but, starting from an empty hash, N inserts take O (N) time.

Rebuilding after a bunch of inserts is actually a well-known technique for datastructures. Often a datastructure is only partially rebuild (giving the technique its name: "partially rebuilding"). Rebuilding the entire datastructure is just an extreme variant.


  • Comment on Re: A short meditation about hash search performance

Log In?

What's my password?
Create A New User
Node Status?
node history
Node Type: note [id://307611]
and the web crawler heard nothing...

How do I use this? | Other CB clients
Other Users?
Others rifling through the Monastery: (6)
As of 2019-04-23 11:55 GMT
Find Nodes?
    Voting Booth?
    I am most likely to install a new module from CPAN if:

    Results (117 votes). Check out past polls.