PerlMonks  

How store a hash table on disk and work with it

by ReneB (Initiate)
on Dec 04, 2008 at 08:13 UTC (#727895)
ReneB has asked for the wisdom of the Perl Monks concerning the following question:

Hi, I am loading 2 x 3M tuples into two hash tables and have to compare them according to some rulez. The problem is that the hash tables take about 600MB in memory, and the machine I have to run the program on has only 512MB installed, so I was thinking about storing the data on disk and working from there. Can you tell me how this would be possible?

my $recordsize = 160;
my ( $portingID, $cli );
print "${INFO}building hash table\t\t - '$infile'\n";
until ( eof(IN) ) {
    # The posted statement stops after the comparison; dying on a short read is an assumed completion.
    read( IN, my $data, $recordsize ) == $recordsize
        or die "short record in '$infile'\n";
    $cli       = substr( $data, 0,  9 );    # key: first 9 bytes of the record
    $portingID = substr( $data, 30, 6 );    # value: 6 bytes at offset 30
    $old_tab{ $cli } = $portingID if $filenum == 1;
    $new_tab{ $cli } = $portingID if $filenum == 2;
    $actld++;
}
close IN;

Re: How store a hash table on disk and work with it
by Corion (Pope) on Dec 04, 2008 at 08:21 UTC

    See tie and especially DB_File or DBM_File, or whatever other btree storage is hopefully installed on your Perl.
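A minimal sketch of the tie approach, not the poster's own code. It uses SDBM_File only because that module ships with Perl; the file name and the sample records are made up for illustration.

```perl
use strict;
use warnings;
use Fcntl;                      # exports O_RDWR and O_CREAT
use SDBM_File;                  # bundled with Perl; stand-in for DB_File here
use File::Temp qw(tempdir);

# Hypothetical file location for this demo; SDBM creates old_tab.pag and old_tab.dir.
my $dir  = tempdir( CLEANUP => 1 );
my $file = "$dir/old_tab";

# After tie, reads and writes on %old_tab go through the DBM file on disk.
tie my %old_tab, 'SDBM_File', $file, O_RDWR | O_CREAT, 0666
    or die "Cannot tie '$file': $!";

# Made-up sample records: 9-character cli => 6-character portingID.
$old_tab{'491700001'} = 'P00001';
$old_tab{'491700002'} = 'P00002';
untie %old_tab;                 # release the tie so the file is closed

# Reopen the same file to show the data now lives outside the hash variable.
tie my %reopened, 'SDBM_File', $file, O_RDWR, 0666
    or die "Cannot reopen '$file': $!";
my @clis   = sort keys %reopened;
my $lookup = $reopened{'491700002'};
untie %reopened;

print "stored keys: @clis\n";
print "491700002 => $lookup\n";
```

With DB_File installed, the tie line would instead look like `tie %old_tab, 'DB_File', $file, O_RDWR|O_CREAT, 0666, $DB_BTREE`, and the rest of the program keeps using the hash as before. SDBM restricts each key/value pair to roughly 1 KB, which is enough for the 9-byte keys and 6-byte values in the question.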

      Many thanks, that was very helpful. I will try to install that module with PPM.
Re: How store a hash table on disk and work with it
by oshalla (Deacon) on Dec 04, 2008 at 09:39 UTC

    I note that you have:

    $old_tab{ $cli } = $portingID if $filenum==1;
    $new_tab{ $cli } = $portingID if $filenum==2;
    You'd save memory by packing the old and new $portingID together and using just one hash table. Shouldn't take long to try it and see if it's enough of a saving.
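One way the packing idea could look, as a sketch only. The helper names `store_id` and `ids_for` and the sample records are hypothetical, and it assumes a real portingID is never empty or all spaces, because the `A6` format pads with spaces and strips them again on unpack.

```perl
use strict;
use warnings;

my %tab;    # cli => old and new portingID packed into one 12-byte string

# Hypothetical helper: store $portingID in the old (file 1) or new (file 2) slot.
sub store_id {
    my ( $cli, $portingID, $filenum ) = @_;
    my ( $old, $new ) = ids_for($cli);
    if   ( $filenum == 1 ) { $old = $portingID }
    else                   { $new = $portingID }
    $tab{$cli} = pack 'A6 A6', $old, $new;
    return;
}

# Hypothetical helper: return ( old, new ); a slot never filled comes back as ''.
sub ids_for {
    my ($cli) = @_;
    return ( '', '' ) unless exists $tab{$cli};
    return unpack 'A6 A6', $tab{$cli};
}

# Made-up sample records standing in for the two input files.
store_id( '491700001', 'P00001', 1 );
store_id( '491700001', 'P00009', 2 );    # present in both, value differs
store_id( '491700002', 'P00002', 1 );    # present only in the old file
store_id( '491700003', 'P00003', 2 );    # present only in the new file

# Old and new values stay separately addressable, so they can still be compared.
my %status;
for my $cli ( sort keys %tab ) {
    my ( $old, $new ) = ids_for($cli);
    $status{$cli} = $new eq ''   ? 'only in old'
                  : $old eq ''   ? 'only in new'
                  : $old eq $new ? 'unchanged'
                  :                'changed';
    print "$cli: $status{$cli}\n";
}
```

Any saving comes from keeping each key and its hash entry once instead of twice, so it depends on how many keys appear in both files.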

      I have to compare the two hash tables according to some rules which are in the rest of the program, so thanks, but I think there is no way to store all that stuff in one hash.
Re: How store a hash table on disk and work with it
by DrHyde (Prior) on Dec 04, 2008 at 11:05 UTC

    I was going to offer to help you, but then I saw "rulez". That stupid and deliberate mis-spelling is obviously a "joke" (although not a very funny one) so I'm going to give you an unfunny joke in response.

    First, write a very small C program which allocates all the memory on your machine except the bare minimum that perl needs. Have it lock all that memory so it can't be swapped out.

    Then run your perl program. All your hashes will be stored in the swap slice.

      What joke? This is only a part of the program, and I have to load two different tables. Sorry for not putting this into a separate function.

        Educated people who have English as a first language frown upon deliberate mis-spelling of words like "rules" as "rulez" - the reason is that those who consider themselves "cool" often do this in an effort to appear superior, but more professional developers aren't impressed by such behavior.

        I suspect, however, in your case that English isn't your first language and that the mis-spelling wasn't deliberate.

Re: How store a hash table on disk and work with it
by dragonchild (Archbishop) on Dec 05, 2008 at 12:49 UTC
    This is exactly why DBM::Deep was written.
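A minimal sketch of what using DBM::Deep for the two tables might look like. The file name and sample records are made up, and because DBM::Deep comes from CPAN and is not part of core Perl, the sketch loads it at run time and only reports its absence if it is missing.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# DBM::Deep is a CPAN module, so check for it at run time instead of failing at compile time.
my $have_dbm_deep = eval { require DBM::Deep; 1 };

my $fetched;
if ($have_dbm_deep) {
    my $dir = tempdir( CLEANUP => 1 );
    my $db  = DBM::Deep->new("$dir/tables.db");    # hypothetical file name

    # Both tables live as nested hashes inside one file on disk.
    $db->{old_tab} = {};
    $db->{new_tab} = {};
    $db->{old_tab}{'491700001'} = 'P00001';        # made-up sample record
    $db->{new_tab}{'491700001'} = 'P00009';

    # Reads look like ordinary hash lookups but are served from the file.
    $fetched = join ' -> ',
        $db->{old_tab}{'491700001'},
        $db->{new_tab}{'491700001'};
    print "491700001: $fetched\n";
}
else {
    print "DBM::Deep is not installed; install it from CPAN to run this sketch\n";
}
```

The nested layout is a choice made for this example; two separate files, one per table, would work the same way.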

    My criteria for good software:
    1. Does it work?
    2. Can someone else come in, make a change, and be reasonably certain no bugs were introduced?
