Re: how to merge many files of sorted hashes?

by sundialsvc4 (Abbot)
on Feb 03, 2012 at 14:52 UTC

in reply to how to merge many files of sorted hashes?

Sometimes, when I have “hairy things referring to other hairy things,” such as a matrix referring to another matrix, I find it useful to introduce the concept of surrogate keys. This sort of goes back to: “if I had to store all this stuff in an old-fashioned library card catalog, which drawer would I put it in, and why?” First of all, I would assign every matrix that I had (no matter how I intended to use it) a random unique identifier such as a UUID. Then I would compute some kind of arithmetic hash value that lets me sift through all of the data and find a given matrix faster. Something simple would do, like the sum of every number in the list after truncating each number to an integer. I’d tag the information with that value and then look only at the entries carrying that tag. Two different matrices can share a tag, so the tag only narrows the search to a few candidates that still have to be compared in full. (This is the original notion of “hashing.”)
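To make the tagging idea concrete, here is a minimal sketch in Python (the thread itself is about Perl; the idea carries over directly). The names `matrix_tag` and `MatrixIndex` are made up for illustration, and a matrix is assumed to be a plain list of rows of numbers. Treat it as one possible reading of the description above, not a finished implementation.

```python
import uuid
from collections import defaultdict


def matrix_tag(matrix):
    """Cheap arithmetic tag: sum of every entry after truncating it to an integer."""
    return sum(int(x) for row in matrix for x in row)


class MatrixIndex:
    """Hypothetical in-memory card catalog: surrogate key -> matrix, tag -> keys."""

    def __init__(self):
        self.by_id = {}                   # surrogate key -> matrix
        self.by_tag = defaultdict(list)   # tag -> surrogate keys sharing that tag

    def add(self, matrix):
        # Every matrix gets a random unique identifier, whatever it is used for.
        key = str(uuid.uuid4())
        self.by_id[key] = matrix
        self.by_tag[matrix_tag(matrix)].append(key)
        return key

    def find(self, matrix):
        # Look only in the "drawer" with the same tag, then compare in full,
        # because different matrices can produce the same tag.
        candidates = self.by_tag.get(matrix_tag(matrix), [])
        return [key for key in candidates if self.by_id[key] == matrix]
```

Calling `index.add(m)` returns the surrogate key for `m`, and `index.find(m)` returns the keys of any stored matrices equal to `m`. The full comparison in `find` matters: the tag is deliberately crude, so collisions are expected and only the size of the search shrinks.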

You don’t necessarily have to put those hundreds of megabytes of data into a database. (SQLite, by the way, is an excellent tool for this.) You just need to find a way to use a database file to catalog the data: to tell you which file a given piece is in, and where. The goal is to reduce the amount of time that you must spend searching for a particular piece of information. You are not reducing that time to zero, just reducing it.
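A catalog like that could look roughly like the following sketch, which uses Python's standard `sqlite3` module. The table layout, the column names, and the helper functions (`open_catalog`, `register`, `locate`, `read_record`) are all assumptions made for illustration. The bulk data stays in its original files; the database only records where each piece lives.

```python
import sqlite3


def open_catalog(db_path=":memory:"):
    """Open (or create) the catalog; the schema here is purely illustrative."""
    conn = sqlite3.connect(db_path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS catalog (
               id          TEXT PRIMARY KEY,   -- surrogate key, e.g. a UUID
               tag         INTEGER NOT NULL,   -- cheap arithmetic hash value
               path        TEXT NOT NULL,      -- file that holds the record
               byte_offset INTEGER NOT NULL,   -- where the record starts
               byte_length INTEGER NOT NULL    -- how many bytes it spans
           )"""
    )
    conn.execute("CREATE INDEX IF NOT EXISTS catalog_tag ON catalog (tag)")
    return conn


def register(conn, key, tag, path, byte_offset, byte_length):
    """Record where one piece of data lives, without copying the data itself."""
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT INTO catalog VALUES (?, ?, ?, ?, ?)",
            (key, tag, path, byte_offset, byte_length),
        )


def locate(conn, tag):
    """Return (id, path, byte_offset, byte_length) for every entry with this tag."""
    cur = conn.execute(
        "SELECT id, path, byte_offset, byte_length FROM catalog "
        "WHERE tag = ? ORDER BY path, byte_offset",
        (tag,),
    )
    return cur.fetchall()


def read_record(path, byte_offset, byte_length):
    """Fetch only the cataloged bytes from the data file."""
    with open(path, "rb") as fh:
        fh.seek(byte_offset)
        return fh.read(byte_length)
```

With the index on `tag`, `locate` lists the few places worth looking, and `read_record` pulls just those bytes instead of scanning every file from the top.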
