Beefy Boxes and Bandwidth Generously Provided by pair Networks
good chemistry is complicated,
and a little bit messy -LW

Re^2: Array vs. Hash for sparsely integer-indexed data

by puterboy (Scribe)
on Jan 29, 2013 at 03:55 UTC ( #1015793=note: print w/replies, xml ) Need Help??

in reply to Re: Array vs. Hash for sparsely integer-indexed data
in thread Array vs. Hash for sparsely integer-indexed data

First of all thanks to *everyone* for all the helpful advice.

Just some clarification. I am caching hard disk inodes since I need to look up hard links (it's to copy a highly hard-linked but structured tree of folders which is too slow to do using rsync and 'cp -a' since they (obviously) don't know the special structure).

So the key is an integer less than max inodes on the filesystem. The value is a 32-digit hex number expressed as an ascii string

Sometimes the list can be pretty dense if for example the tree being copied occupies the majority of the filesystem and if most of the inodes are used (or at least if they are allocated relatively contiguous); other times it could be o(10%) or less.

Each inode hash entry may be looked up anywhere from once to 10s or even thousands of times (presumably no more than maxhardlinks).

After doing some profiling, it seems that the speed of lookup is less important than memory usage since the rate limiting step is fstat'ing files which is significantly slower.

So, it seems that hashes are best since except in the densest of cases, they will require less storage.

Again, thanks for all the helpful advice. I learned a lot!
  • Comment on Re^2: Array vs. Hash for sparsely integer-indexed data

Log In?

What's my password?
Create A New User
Node Status?
node history
Node Type: note [id://1015793]
and all is quiet...

How do I use this? | Other CB clients
Other Users?
Others taking refuge in the Monastery: (8)
As of 2018-06-24 17:38 GMT
Find Nodes?
    Voting Booth?
    Should cpanminus be part of the standard Perl release?

    Results (126 votes). Check out past polls.