
Re^2: Proof of concept: File::Index

by davido (Archbishop)
on May 12, 2006 at 15:25 UTC (#549037)

in reply to Re: Proof of concept: File::Index
in thread Proof of concept: File::Index

Tie::File builds its index every time you tie the file and doesn't store it. That means every time a script starts up, Tie::File has to skim through the entire 'big file' to find all the line endings or record separators, which costs O(n) time on every run. The entire point of File::Index is to avoid rebuilding the index unless you specifically tell it to.

File::Index is helpful for 'big files' that don't change frequently. It builds an index once and stores it for future use, so the next time the script runs, the index already exists and each item lookup takes O(1) time. Building the index still costs O(n) time, but that cost is paid only once, or again only when you explicitly ask for a rebuild.

I also used packed longs for the index so that the index file is as compact as is practical. Finding the Nth entry doesn't require reading through the index file; it just involves seeking to LONG_SIZE * entry-number in the index file, which is an O(1) operation.
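
To make the seek arithmetic concrete, here is a minimal sketch of the technique. It is not File::Index's actual code: the names build_index and fetch_line are hypothetical, and the 'N' pack format (a 32-bit big-endian unsigned long, which limits the data file to 4 GB) is an assumption chosen for portability.

```perl
use strict;
use warnings;

# Hypothetical helpers for illustration; not File::Index's real API.
# Assumption: offsets are stored with pack 'N' (32-bit, big-endian).
use constant LONG_SIZE => length(pack('N', 0));    # 4 bytes per entry

# Scan the data file once, O(n), and store each line's start offset.
sub build_index {
    my ($data_file, $index_file) = @_;
    open my $in,  '<:raw', $data_file  or die "Can't read $data_file: $!";
    open my $out, '>:raw', $index_file or die "Can't write $index_file: $!";
    my $offset = 0;
    while (defined(my $line = <$in>)) {
        print {$out} pack('N', $offset);
        $offset += length $line;    # byte length, since the handle is :raw
    }
    close $out or die "Can't close $index_file: $!";
    close $in;
    return;
}

# Fetch line $n (zero-based) with two seeks and no scanning.
sub fetch_line {
    my ($data_file, $index_file, $n) = @_;
    open my $idx, '<:raw', $index_file or die "Can't read $index_file: $!";
    seek $idx, $n * LONG_SIZE, 0 or die "Can't seek in $index_file: $!";
    my $got = read $idx, my $packed, LONG_SIZE;
    return unless defined $got && $got == LONG_SIZE;    # no such entry
    open my $in, '<:raw', $data_file or die "Can't read $data_file: $!";
    seek $in, unpack('N', $packed), 0 or die "Can't seek in $data_file: $!";
    return scalar <$in>;
}
```

A script would call something like build_index($data, $index) unless -e $index; on startup, so the O(n) scan happens only when the index file is missing, and then call fetch_line($data, $index, $n) for each lookup.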

