
Re^5: RFC on Inline::C hack: Hash_Iterator

by Tanalis (Curate)
on Aug 02, 2005 at 11:51 UTC


in reply to Re^4: RFC on Inline::C hack: Hash_Iterator
in thread RFC on Inline::C hack: Hash_Iterator

The structures you are talking about are defined by Perl itself.

I didn't know that, and I find it interesting. The cost of pointing back to the previous record is effectively negligible; however, I agree that the benefits are also minimal.

The linked lists used for buckets in Perl's hashes are intended to be extremely small, i.e., generally they should hold only one element, and except for degenerate cases should not really exceed two elements. With this in mind, a binary tree approach makes less sense, as in most cases you will derive no benefit from it at all.

Agreed, from a Perl perspective. I should point out that I was aiming to talk in a more general way with that comment - though I accept that I didn't make that clear. My experience is largely with Other Languages that perhaps don't have such clever internals as Perl and where the underlying data structures can make the difference between terrible and acceptable performance speeds for extremely heavily loaded hash tables.


Re^6: RFC on Inline::C hack: Hash_Iterator
by demerphq (Chancellor) on Aug 02, 2005 at 12:58 UTC

    My experience is largely with Other Languages that perhaps don't have such clever internals as Perl and where the underlying data structures can make the difference between terrible and acceptable performance speeds for extremely heavily loaded hash tables.

    I'm guessing that you mean scenarios where you have to hand-code your own hash table implementations. I'm also guessing that you mean scenarios where you have to work with statically sized hash tables. In such a scenario I could see your point for sure. A hash table of small threaded binary trees sounds like a good design to me.

    However, just for your edification I'll outline in general how Perl's hashes work: first, the size of the hash table is always a power of 2, starting at 8 buckets. When the bucket chains start getting too long (calculated, I believe, from the ratio of the number of keys to the number of buckets), the size of the hash array is doubled and the keys of the original are remapped into the new hash array. The hash values are not recalculated, as the power-of-two rule implies that the remapping can occur simply by ANDing a different bit mask with the hash values to determine the new slot in the array. In normal circumstances the actual keys are stored only once, in a master hash, with pointers from the buckets of the actual hash to the master hash's buckets (which actually contain the key string). This key sharing is important, as hashes are the most common way of representing objects, which will by and large tend to have many keys in common.
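To make the doubling-and-masking step concrete, here is a minimal sketch in C. It is not perl's actual hv.c code: the names (`Entry`, `Table`, `hash_key`, `slot_for`, `table_grow`, `table_store`, `table_fetch`) are made up for illustration, the hash function is plain FNV-1a rather than whatever perl uses, the growth trigger (more keys than buckets) is an assumption, and the shared-key master table is left out entirely. What it does show is that each entry caches its hash value, so growing the table only needs a new mask.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* One chain entry. The hash is cached so growing never rehashes the key. */
typedef struct Entry {
    struct Entry *next;
    uint32_t hash;
    const char *key;   /* borrowed pointer; the caller keeps the string alive */
    int value;
} Entry;

typedef struct {
    Entry **buckets;
    size_t size;       /* number of buckets, always a power of two */
    size_t count;      /* number of keys stored */
} Table;

/* Illustrative hash (FNV-1a); an assumption, not perl's hash function. */
static uint32_t hash_key(const char *s) {
    uint32_t h = 2166136261u;
    while (*s) { h ^= (unsigned char)*s++; h *= 16777619u; }
    return h;
}

/* With a power-of-two size, the slot is just the low bits of the hash. */
static size_t slot_for(uint32_t hash, size_t size) {
    assert(size != 0 && (size & (size - 1)) == 0);
    return hash & (size - 1);
}

static Table *table_new(void) {
    Table *t = malloc(sizeof *t);
    if (!t) return NULL;
    t->size = 8;       /* starting size mentioned in the post */
    t->count = 0;
    t->buckets = calloc(t->size, sizeof *t->buckets);
    if (!t->buckets) { free(t); return NULL; }
    return t;
}

/* Double the bucket array and relink every entry using its cached hash. */
static int table_grow(Table *t) {
    size_t new_size = t->size * 2;
    Entry **fresh = calloc(new_size, sizeof *fresh);
    if (!fresh) return -1;
    for (size_t i = 0; i < t->size; i++) {
        Entry *e = t->buckets[i];
        while (e) {
            Entry *next = e->next;
            size_t s = slot_for(e->hash, new_size);  /* new mask, same hash */
            e->next = fresh[s];
            fresh[s] = e;
            e = next;
        }
    }
    free(t->buckets);
    t->buckets = fresh;
    t->size = new_size;
    return 0;
}

/* Insert or update; returns 0 on success, -1 on allocation failure. */
static int table_store(Table *t, const char *key, int value) {
    uint32_t h = hash_key(key);
    size_t s = slot_for(h, t->size);
    for (Entry *e = t->buckets[s]; e; e = e->next) {
        if (e->hash == h && strcmp(e->key, key) == 0) {
            e->value = value;
            return 0;
        }
    }
    Entry *e = malloc(sizeof *e);
    if (!e) return -1;
    e->hash = h;
    e->key = key;
    e->value = value;
    e->next = t->buckets[s];
    t->buckets[s] = e;
    t->count++;
    /* Assumed trigger: grow once there are more keys than buckets. */
    if (t->count > t->size) return table_grow(t);
    return 0;
}

/* Returns a pointer to the stored value, or NULL if the key is absent. */
static int *table_fetch(Table *t, const char *key) {
    uint32_t h = hash_key(key);
    for (Entry *e = t->buckets[slot_for(h, t->size)]; e; e = e->next)
        if (e->hash == h && strcmp(e->key, key) == 0) return &e->value;
    return NULL;
}
```

Because the new mask has exactly one more bit than the old one, an entry that lived in slot i of the old array can only land in slot i or slot i + old_size of the new one, which is why no key ever needs rehashing.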

    ---
    $world=~s/war/peace/g
