Disk based hash (as opposed to RAM based)
by techtruth (Novice)
on Oct 07, 2012 at 19:37 UTC
techtruth has asked for the
wisdom of the Perl Monks concerning the following question:
Hello monks, I would like to call upon your wisdom again today.
How can I create a hash that stores its data on a physical disk?
I have looked over some documentation and it looks like there are many ways to do this, but I can't seem to make sense of it. One method involved writing a special Perl module and defining my own functions for "store", "delete", etc. I would like to avoid that if possible. I also saw the use of tie and Tie::StdHash, but found them confusing. Currently I am using tie with DB_File to tie a hash to a DB file, but I am having trouble inserting new data.
I need to store roughly 5 GB of data as a hash of arrays, so I cannot keep it in my system's RAM. My problem comes when I attempt to push a new value onto an array.
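For anyone who hits the same wall: as far as I can tell, a DB_File tie stores every value as a flat string, so an array reference does not survive the trip to disk and a push onto it has nothing to work with. One way around that, as a minimal sketch and not something tested at the 5 GB scale, is to keep each "array" as a single joined string and split it when reading. The helper names push_value and get_values, the file name, and the NUL separator are all made up for illustration; the separator must never appear inside a value.

```perl
use strict;
use warnings;
use Fcntl;      # O_RDWR, O_CREAT
use DB_File;    # exports $DB_HASH

# Assumed separator: must never occur inside a stored value.
my $SEP = "\0";

# Append one value to the list stored under $key (fetch, modify, store).
sub push_value {
    my ($db, $key, $value) = @_;
    my $old = $db->{$key};
    $db->{$key} = defined $old ? $old . $SEP . $value : $value;
    return;
}

# Return the list stored under $key, or an empty list if the key is absent.
sub get_values {
    my ($db, $key) = @_;
    my $packed = $db->{$key};
    return defined $packed ? split(/\Q$SEP\E/, $packed, -1) : ();
}

my $file = "hoa_demo.db";   # hypothetical file name
unlink $file;               # start from an empty database for the demo

tie my %hoa, 'DB_File', $file, O_RDWR|O_CREAT, 0666, $DB_HASH
    or die "Cannot tie $file: $!";

push_value(\%hoa, 'fruit', $_) for qw(apple banana cherry);

my @fruit = get_values(\%hoa, 'fruit');
print "@fruit\n";           # prints "apple banana cherry"

untie %hoa;
```

Note that every append rewrites the whole string for that key, so with around 1000 values per key the cost of each push grows with the list. DB_File's BTREE format with the R_DUP flag (duplicate keys) and the MLDBM module are other options that may be worth a look for this kind of data.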
Is there a simple way to do this that I am missing? My code works when the hash lives in memory, but fails once I tie the hash to a file on disk. Read/write speed on the hash still matters to me, although I realize that writing to disk is much slower than RAM.
Here is an example of my code:
I understand that if I wrote my own handlers for "store", "delete", etc., I could make each newly assigned value be appended to an array, but I would like to stay away from hairy situations...
Update:
I need to store around 1000 values in each array.
Solved:
I have returned from MySQL land with a solution. Since my input data is formatted as strings of the form "value.key", I wrote them in bulk to a temporary file. I then used MySQL's LOAD DATA INFILE statement to populate a temporary table. Next, I used INSERT together with a combination of MySQL's string functions to build a table with two columns: key and value. The INSERT took all of the data in the temporary table and inserted it into the new two-column table. Now I can "select where key equals" to emulate Perl's amazing hashes. It is not as fast as I would like, but I can now process massive input files.
Thank you monks. I accept and appreciate your wisdom.