out of memory problem

by lightoverhead (Pilgrim)
on Sep 12, 2008 at 22:24 UTC

lightoverhead has asked for the wisdom of the Perl Monks concerning the following question:

Hi monks, I have created a hash of hashes for a very large data file, trying to make use of my 8 GB of RAM, but the script runs out of memory. My code works for smaller data sets, but not for the whole file (20,000,000 lines). The major chunk of the code is below:
    while(<>){
        if (/^#+/){
            NEXT;
        }
        else{
            chomp;
            @data=split (/\s+/,$_);
            $a=$data[0];
            $b=$data[3];
            $c=$data[4];
            $d=$data[5];
            $row{$a}{$b}{$c}=$d;
        }
    }
    print "Fished establishing hashs.\n";
I don't think the problem is caused by limited memory, since it reported "out of memory" when it had used only 40% of my 8 GB of RAM. Could any Perl guru help me resolve this problem? Thanks.

Replies are listed 'Best First'.
Re: out of memory problem
by BrowserUk (Patriarch) on Sep 12, 2008 at 22:33 UTC
      Thank you all for your answers. I checked our system, and it showed:
      ..... Thu Jul 10 00:12:43 UTC 2008 i686 GNU/Linux
      Does this mean I am using a 32-bit system, which only allows perl to use 4 GB no matter how large my actual RAM is?
Re: out of memory problem
by FunkyMonk (Chancellor) on Sep 12, 2008 at 23:05 UTC
    What do you expect NEXT to do?

    If you want to be more sure that your hash contains valid data, you should include:

    use strict; use warnings;

    at the start of your script, and fix all the errors reported.

    If you're going to claim that it was just a typo when you created your node, just think how easily the same typo could exist in your real code.
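To see what strict would have reported for the posted loop, here is a minimal sketch. It compiles a cut-down copy of the loop inside a string eval so that the compile error can be captured and printed; the sample lines are made up for illustration.

```perl
use strict;
use warnings;

# A cut-down copy of the posted loop, kept in a string so that the
# compile-time error can be caught instead of aborting this script.
my $snippet = q{
    use strict;
    for my $line ("# a comment", "some data") {
        if ($line =~ /^#+/) { NEXT; }
    }
    1;
};

my $ok    = eval $snippet;
my $error = $@;

# Under "strict subs" the bareword NEXT is rejected at compile time.
print "compile failed: $error" unless $ok;
```

Without strict, the bareword is quietly treated as a string and the statement does nothing, which is why the typo is easy to miss.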

Re: out of memory problem
by Joost (Canon) on Sep 13, 2008 at 01:33 UTC
    What do you want us to do?

    If you just want to know why your program can't allocate more memory at 3 GB, one helpful pointer might be your ulimit manpage. Also, 3 GB may be your system's limit for any single process (especially if you're running a 32-bit system).

    Edit: just to be clear, I've run multi-threaded perl processes at 12+ GB without any problems on 64-bit Linux systems.
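One way to check whether the perl binary itself is a 32-bit or 64-bit build is to ask the core Config module. This is only a sketch; it reports how perl was compiled and says nothing about ulimit settings or kernel limits.

```perl
use strict;
use warnings;
use Config;

# ptrsize is the size of a C pointer in bytes for this perl build:
# 4 means a 32-bit perl, 8 means a 64-bit perl.
my $bits = $Config{ptrsize} * 8;

print "archname:     $Config{archname}\n";
print "pointer size: $Config{ptrsize} bytes ($bits-bit perl)\n";

# A 32-bit process cannot address more than 4 GB, whatever the installed RAM.
print "This perl is limited to a 32-bit address space.\n" if $bits == 32;
```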

Re: out of memory problem
by Cody Pendant (Prior) on Sep 13, 2008 at 00:24 UTC
    One obvious solution is to use some kind of tied hash where the data would actually be on disk, not in your RAM, impressive as it is...
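A minimal sketch of the tied-hash idea, assuming the core SDBM_File module is acceptable. The temporary file location, the sample lines and the tab-joined key are all made up for illustration. A DBM file stores flat strings only, so the three key fields are joined into one key, and SDBM limits each key/value pair to roughly 1 KB.

```perl
use strict;
use warnings;
use Fcntl;                      # O_RDWR, O_CREAT
use SDBM_File;                  # DBM implementation shipped with perl
use File::Temp qw(tempdir);
use File::Spec;

# Hypothetical location for the on-disk hash.
my $dir  = tempdir(CLEANUP => 1);
my $file = File::Spec->catfile($dir, 'rows');

# The hash contents live in $file.pag / $file.dir rather than in RAM.
tie my %row, 'SDBM_File', $file, O_RDWR | O_CREAT, 0666
    or die "Cannot tie $file: $!";

# Hypothetical sample lines in the six-column layout the question implies.
for my $line ("chr1 x y 100 200 0.5", "chr1 x y 100 300 0.7") {
    my @data = split ' ', $line;
    $row{ join "\t", @data[0, 3, 4] } = $data[5];
}

my $value = $row{"chr1\t100\t200"};
print "chr1/100/200 => $value\n";

untie %row;
```

Every lookup now goes through the DBM file, so expect this to be much slower than an in-memory hash.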


    Nobody says perl looks like line-noise any more
    kids today don't know what line-noise IS ...
Re: out of memory problem
by salva (Canon) on Sep 13, 2008 at 09:30 UTC
    using a perl compiled for 64 bits would probably help...

    But do you really need to maintain three levels of hashes? That would be a complete memory hog for most kinds of data.

    Maybe storing everything in just one hash as $row{$a, $b, $c} = $d could suit your needs.
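A small sketch of the single-hash form, with made-up sample values. A comma-separated subscript is joined with the subscript separator `$;` (by default "\034") into one flat key, so the key parts can be recovered later with split. The variable names are placeholders; `$a` and `$b` are avoided because sort uses them.

```perl
use strict;
use warnings;

my %row;
my ($k1, $k2, $k3, $value) = ('chr1', 100, 200, 0.5);   # hypothetical fields

my $sep = $;;    # current subscript separator, "\034" unless changed

# A literal comma list in the subscript is joined with $; into one key.
$row{$k1, $k2, $k3} = $value;

# The same element, addressed with an explicitly joined key.
my $same = $row{ join $sep, $k1, $k2, $k3 };

for my $key (keys %row) {
    my @parts = split /\Q$sep\E/, $key;
    print "@parts => $row{$key}\n";
}
```

This only works if the separator never occurs in the key fields themselves.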

    And consider using SQLite also.

Re: out of memory problem
by blazar (Canon) on Sep 14, 2008 at 11:29 UTC
    while(<>){
        if (/^#+/){
            NEXT;
        }
        else{
            chomp;
            @data=split (/\s+/,$_);
            $a=$data[0];
            $b=$data[3];
            $c=$data[4];
            $d=$data[5];
            $row{$a}{$b}{$c}=$d;
        }
    }
    print "Fished establishing hashs.\n";

    I personally believe that your code exhibits several problems, which may or may not be related to your memory usage problem. (The latter may also lie outside this code snippet.) First of all, it may be strict-safe, but to all appearances it is not; so the first and best recommendation I can give you is to make it so!

    Your code is also horribly and painstakingly indented, and other details suggest it may be retyped: in which case... don't! Paste it instead! One of the details is NEXT (note the capitalization!) which may actually be a sub of yours, but rather appears as a typo for next. If it is, then you don't need the else clause: that's precisely what next is about!

    Then, your split is very close to an argumentless one, which is "optimized" to do what one generally wants. I know that may not be your case, but I suspect it is.
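To make the difference concrete, here is a short sketch with a made-up input line. An argumentless split behaves like `split ' ', $_`, which skips leading whitespace, while `split /\s+/` yields an empty first field when the line starts with whitespace.

```perl
use strict;
use warnings;

my $line = "  chr1 gene  x 100";    # hypothetical line with leading whitespace

my @with_regex = split /\s+/, $line;   # ("", "chr1", "gene", "x", "100")
my @awk_style  = split ' ',   $line;   # ("chr1", "gene", "x", "100")

printf "split /\\s+/ : %d fields, first is '%s'\n",
    scalar @with_regex, $with_regex[0];
printf "split ' '    : %d fields, first is '%s'\n",
    scalar @awk_style, $awk_style[0];
```

With `split /\s+/` every column index shifts by one on such lines, so `$data[0]` would be the empty string.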

    Coming closer to your actual problem, do you really need a three-level HoH? That creates quite a lot of references, so you may want to concoct a single key by suitably joining the keys you have now, if possible: perl even does this for you with an old (but perfectly working!) mechanism of pseudo-multilevel hashes. All in all, your code may be transformed into:

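    my %row;
    while (<>) {
        next if /^#/;   # if there's "one or more" then there's one!
        chomp;
        my @data = split;
        # a literal comma list in the subscript is joined with $; into one key
        # (a slice there would be evaluated in scalar context instead)
        $row{ $data[0], $data[3], $data[4] } = $data[5];
    }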
    --
    If you can't understand the incipit, then please check the IPB Campaign.
Re: out of memory problem
by betterworld (Curate) on Sep 13, 2008 at 12:55 UTC
    @data=split (/\s+/,$_);
    $a=$data[0];
    $b=$data[3];
    $c=$data[4];
    $d=$data[5];

    Note that because you don't use "my" here, all these variables will stay in memory even after the loop has finished. If the last line is very long, this might be a memory problem.

    Anyway, this code does not look like it runs under strict. You should fix this, and you may then discover more problems in your code.

      Note that because you don't use "my" here, all these variables will stay in memory even after the loop has finished.

      That happens when using my too. Neither the SV nor the string buffer is freed. They are simply cleared.

      Update: The following isn't proof, but it does let you see the effect of what I described.

        You're right. The reuse of the address could be a coincidence, but as the LEN stays 12, we can conclude that the buffer was never freed.

        I've done a little bit of testing too:

        One question remains: When does perl free memory? I can't quite believe that every lexical variable that has ever been used results in large stale memory blocks. This would be against the spirit of this excerpt from perlsyn:

        You wouldn't want memory being free until you were done using it, or kept around once you were done. Automatic garbage collection takes care of this for you.
