PerlMonks  

Perl script end up on saying "Out of Memory !"

by syedumairali (Initiate)
on Sep 20, 2011 at 07:42 UTC
syedumairali has asked for the wisdom of the Perl Monks concerning the following question:

Hi Monks,

I am writing a Perl script that searches for specific text in CSV text files (100000 rows & 30 KB), and there is a very large number of files. I am using a hash: I first load one file into the hash and then search for the specific text. After the search finishes, I use the same hash to load the second CSV file, of the same size, and search it as well.

The script ran perfectly for the first 60-odd files, but after that it crashed with "Out of Memory !".

While the script is running, I can also see in Task Manager that the available memory keeps decreasing (the machine has 2 GB of RAM).

I think I am failing to clear the hash variable (%data1), and the error message appears once my hash has used up all the memory.

Question: How can I erase or clear the hash before my Perl script takes the second file? Here is the sample code (only the relevant parts are shown):
    # @ lines contain csv files
    my %data1;
    shift(@lines1); # remove column headings from file
    shift(@lines1); # remove column headings from file
    foreach my $line (@lines1) {
        @words = split (/\,/, $line);
        if ($words[6] > 90) {
            my $abstime = $words[1];
            my $payload = $words[5];
            $srcIPhex = substr $payload, 24, 8;
            my $dstIPhex = substr $payload, 32, 8;
            my $timestamp = substr $payload, 152, 12;
            my $HashKey; # to get total number
            $HashKey = $srcIPhex.$abstime;
            $data1{$HashKey}{ID} = $words[0];
            $data1{$HashKey}{SRC_IP} = $srcIPhex;
            $data1{$HashKey}{DST_IP} = $dstIPhex;
            MeasureFiles(\%data1);
        }

    sub MeasureFiles {
        my ($list_a_ref) = @_;
        my %data1 = %$list_a_ref; # Dereference lists
        ....
        ....
        foreach (keys %data1) {
            $SrcIP_captured = inet_ntoa( pack( "N", hex( $data1{$_}{SRC_IP} ) ) );
            $DstIP_captured = inet_ntoa( pack( "N", hex( $data1{$_}{DST_IP} ) ) );
            foreach(my $i=0;$i<$ind;$i++){
                if ( $SrcIP_captured eq $SrcIP_ref[$i] && $DstIP_captured eq $DstIP_ref[$i]) {
                    $pkt_received++;
                }
            }
        }
        ....
        ....
        open(R1,">> $mainDirectory\\Results\\$file_result") || die("Cannot Open File $file_result");
        my $results = "$SrcIP_ref[$i],$DstIP_ref[$i],$pkt_received";
        print R1 "$results\n";
        close(R1);
    }

Replies are listed 'Best First'.
Re: Perl script end up on saying "Out of Memory !"
by moritz (Cardinal) on Sep 20, 2011 at 07:54 UTC
    Question : How can I erase or clear the hash before my perl script takes the second file ?

    The simplest way is to declare it in such a way that it goes out of scope when you stop processing the file. Something along the lines of:

        for my $filename (@files) {
            my %data1;
            # do all processing of file $filename here
        }

    Alternatively, you can use undef %data1.
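Both options can be sketched roughly as follows, assuming each file can be processed on its own. The helper name `count_rows` and the in-memory sample data are hypothetical and exist only to keep the example self-contained. Note that Perl may keep freed memory for its own reuse rather than hand it back to the operating system, so the figure in Task Manager may level off rather than drop.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the per-file work: builds a fresh hash from
# one file's lines and returns how many keys it stored.
sub count_rows {
    my ($lines) = @_;
    my %data1;                      # new, empty hash on every call
    for my $line (@$lines) {
        my @words = split /,/, $line;
        $data1{ $words[0] } = $words[1];
    }
    return scalar keys %data1;      # %data1 goes out of scope when the sub returns
}

# Two in-memory "files" (assumed sample data).
my @files = ( [ "a,1", "b,2" ], [ "c,3" ] );

# Option 1: the hash is scoped to the processing of a single file.
for my $file (@files) {
    print count_rows($file), "\n";
}

# Option 2: keep one hash, but empty it explicitly between files.
my %shared = ( x => 1, y => 2 );
undef %shared;                      # %shared = (); also empties it
print scalar( keys %shared ), "\n";
```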

    my %data1 = %$list_a_ref; # Dereference lists

    That doesn't just dereference, it also creates a copy. Do you want that?

      Thanks, Moritz, for your guidance. Regarding your question: in fact, I do not want a copy of the hash inside the MeasureFiles subroutine. Can you help me with how to get only the reference, and not a copy of the hash, inside the routine? Thanks!
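One way to do this, as a minimal sketch: keep the reference in a scalar and reach into the caller's hash with `%$ref` and `$ref->{...}` instead of assigning it to a new hash. The sub name `measure_files` and the sample records below are hypothetical; only the field names come from the question.

```perl
use strict;
use warnings;

# Works directly on the caller's hash through the reference; no copy is made.
sub measure_files {
    my ($data_ref) = @_;
    my $count = 0;
    for my $key ( keys %$data_ref ) {
        # Arrow syntax reads from the original hash, not from a duplicate.
        $count++ if defined $data_ref->{$key}{SRC_IP};
    }
    return $count;
}

# Assumed sample data shaped like the hash in the question.
my %data1 = (
    k1 => { ID => 1, SRC_IP => '0a000001', DST_IP => '0a000002' },
    k2 => { ID => 2, SRC_IP => '0a000003', DST_IP => '0a000004' },
);

print measure_files( \%data1 ), "\n";
```

By contrast, `my %data1 = %$list_a_ref;` builds a second top-level hash on every call, which matters when the hash is large.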
Re: Perl script end up on saying "Out of Memory !"
by armstd (Friar) on Sep 23, 2011 at 14:35 UTC

    Since it appears that each file is processed independently of the others, and no state is maintained in memory, you might also consider forking processes to handle each file instead of doing it directly in one process. Your parent process won't be affected by any memory consumed by child processes.

    Also, if you 'exec "/bin/true"' or some-such instead of 'exit()' at the end of each child process, you'll find that memory frees up much faster than waiting for perl garbage collection, helping performance too.

    --Dave
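A rough sketch of the one-child-per-file idea, using plain `exit` in the child. The names `process_file` and `run_in_children` and the file names are hypothetical. One caveat: on Windows, Perl emulates `fork` with threads inside a single process (see perlfork), so the memory isolation described above may not apply there.

```perl
use strict;
use warnings;

$| = 1;    # unbuffered output, so children do not re-emit pending text

# Hypothetical per-file worker; the real one would parse the CSV file.
sub process_file {
    my ($filename) = @_;
    my %data1 = ( name => $filename );   # memory used here belongs to the child
    return scalar keys %data1;
}

# Forks one child per file and waits for it before starting the next.
sub run_in_children {
    my (@files) = @_;
    my $ok = 0;
    for my $filename (@files) {
        my $pid = fork();
        die "fork failed: $!" unless defined $pid;
        if ( $pid == 0 ) {
            process_file($filename);
            exit 0;                      # child ends here
        }
        waitpid( $pid, 0 );
        $ok++ if $? == 0;                # count children that exited cleanly
    }
    return $ok;
}

print run_in_children( 'a.csv', 'b.csv', 'c.csv' ), "\n";
```

Waiting for each child before forking the next keeps only one worker alive at a time, which suits a machine that is already short on memory.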

Re: Perl script end up on saying "Out of Memory !"
by pvaldes (Chaplain) on Sep 23, 2011 at 19:25 UTC
    foreach my $line (@lines1) { @words = split (/\,/, $line); if ($words[6] > 90) { ... }

    I miss an else statement here, or maybe:

        while (defined(my $line = shift @lines1)) {
            @words = split(/,/, $line, 8);
            next if $words[6] <= 90;
            ...
        }

    Foreach typically requires more memory than while (and you have several foreach loops); use while instead unless you have good reasons to use a foreach loop.
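The memory difference is clearest when reading a file: `foreach (<$fh>)` builds a list of every line before the loop starts, while `while (<$fh>)` reads one line per iteration. Here is a minimal sketch of the second form, assuming the rows can be examined one at a time. The sub name `count_matching` and the sample data are hypothetical, and an in-memory filehandle stands in for a real CSV file.

```perl
use strict;
use warnings;

# Assumed sample data: one heading row, then three data rows.
my $csv = "id,time,a,b,c,payload,len\n"
        . "1,10,x,x,x,p,95\n"
        . "2,11,x,x,x,p,40\n"
        . "3,12,x,x,x,p,120\n";

# Reads one line at a time and counts rows whose 7th field exceeds $min.
sub count_matching {
    my ( $fh, $min ) = @_;
    my $count  = 0;
    my $header = <$fh>;                 # skip the column headings
    while ( my $line = <$fh> ) {        # only the current line is held
        chomp $line;
        my @words = split /,/, $line;
        next if $words[6] <= $min;      # discard unwanted rows immediately
        $count++;
    }
    return $count;
}

open my $fh, '<', \$csv or die "Cannot open in-memory file: $!";
print count_matching( $fh, 90 ), "\n";
close $fh;
```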

    I am usign hashkeys to first put one file into the hash. And then search the specific text.

    If you have a lot of files and you expect a lot of non-matching lines, try to discard the undesired files/lines as soon as possible. Sounds to me like a job for grep, regexp and next.

    You may not care about what comes after the seventh field. If that is your case, pass a maximum number of fields to split. That way split stops earlier and requires less memory.
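A small sketch of what the limit argument does, with an assumed sample line. The helper name `split_limited` is hypothetical. With a limit of 8, split produces at most 8 fields and leaves the rest of the line unsplit in the last one.

```perl
use strict;
use warnings;

# Hypothetical helper: splits on commas but stops after $limit fields.
sub split_limited {
    my ( $line, $limit ) = @_;
    return split /,/, $line, $limit;
}

# Assumed sample line with more fields than the script needs.
my $line  = "1,10,x,x,x,payload,95,extra1,extra2,extra3";
my @words = split_limited( $line, 8 );

print scalar(@words), "\n";   # 8
print $words[6], "\n";        # 95
print $words[7], "\n";        # extra1,extra2,extra3
```

The limit must be one more than the highest field needed, here 8 for the seventh field; otherwise that field would absorb the remainder of the line.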
