dimmesdale has asked for the wisdom of the Perl Monks concerning the following question:

Monks...

I have a script I'm running, and as part of the (MUCH larger) application, there's a function that takes up so much memory that perl tells me I'm out of it. I've had it working elsewhere (I've since modified it slightly and added some things), but I need help.

The file I'm processing is REALLY large (hence the line by line read in) -- and when I say large, I'm talking about 500-2400 KB in size.

I've done some debugging and the results are REALLY confusing to me. Everything works fine ... until I get to the last statement of the while loop. I added a print "I'm at the end of the while loop"-ish statement, and it prints. However, a print "I'm out of the while loop"-ish statement doesn't print. IT NEVER LEAVES THE LOOP, but seemingly stops looping.

Here's the code:

sub process_rare_RE {
    my $file = shift;
    my $PART3 = 0;
    my ($sums,$avgs,$ydata,@xdata,@frmt,@values);

    open(IN,$file) or die "$file failed to open: $!";

    ## Reset sums
    for (0..3) { $sums->[$_] = 0 }

    while (<IN>) {
        @values = split /\t/, $_;

        if ( ($values[4] == 0 && $values[0] >= 2.5) ||
             ($file =~ /Ts21/ && $values[0] == 2.5) ) { $PART3 = 1 }

        if($PART3) {
            next if $values[1] < 50;
            if ($_[0] == 3 || $_[0] == 1) {
                for (0..2) { $sums->[$_] += $values[$_+1] }
                $sums->[3]++;
            }
            if ($_[0] == 2 || $_[0] == 1) {
                push @xdata, $values[0];
                for (0..2) { push @{$ydata->[$_]}, $values[$_+1] }
            }
        }
    }
    ## never gets here, hung up above (^)
    close IN;

    for (0..$#xdata) { push @frmt, [ $xdata[$_], \@{$ydata} ] }
    for (0..2) { $avgs->[$_] = $sums->[$_]/$sums->[3] }

    if ($Debug > 2) {
        print "\t\tReturn value for process_rare_RE is: \n",
              Dumper( ($avgs,\@frmt) )
    }

    return ( $avgs, \@frmt );
}

Re: Out of memory help
by dimmesdale (Friar) on Jul 19, 2002 at 17:26 UTC
    Oops ... a debugging statement somehow slipped in there. Oh well. If it helps, I thought I'd show you a few lines of the data:

    0 68.1818 77.0728 0.0445557 5
    0.000333333 68.1818 77.0743 0.0436401 5
    0.000666667 68.1818 77.0728 0.0436401 0
    0.001 68.1818 77.0758 0.0436401 0
    0.00133333 68.1818 77.0712 0.0445557 0
    0.00166667 68.1818 77.0728 0.0436401 0
    0.002 68.1818 77.0743 0.0439453 0
    0.00233333 68.1818 77.0728 0.0436401 0
    0.00266667 68.1818 77.0743 0.0436401 0
    0.003 68.1818 77.0743 0.0448608 5
    0.00333333 68.1818 77.0743 0.0442505 5
    0.00366667 68.1818 77.0789 0.0439453 5
    0.004 68.1818 77.0758 0.0436401 5
    0.00433333 68.1818 77.0743 0.0436401 5
    0.00466667 68.1818 77.0773 0.0439453 5

    The 0.xxxxxxxxx value (first column) is the time stamp. To get an idea of the file size, realize that the time goes up to 4.3 - 5+ depending on the file (and at 0.00033333 increments, that's a long way).

Re: Out of memory help
by RMGir (Prior) on Jul 19, 2002 at 17:37 UTC
    UnEdit: Hmm, maybe it IS -w clean...

    You never reset $PART3 to 0, is that deliberate?

    I don't see how that would make you hang, though.

    Apart from that, I'm lost in a maze of AoAs :)

    Oh, you are using @values in some spots, and @_ in others... Also not likely to hang you, but it is a bit odd...
    --
    Mike
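
To illustrate the @values / @_ point above: inside a sub, $_[0] is an element of the argument array @_, and it has nothing to do with the loop variable $_. Here is a minimal sketch, assuming the sub is called with a file name followed by a numeric option. The sub name show_args and the arguments 'data.txt' and 3 are made up for the example.

```perl
use strict;
use warnings;

# Hypothetical sub: shows what $_[0] refers to after a shift.
sub show_args {
    my $file = shift;    # removes the first argument from @_
    my @seen;
    for (0 .. 2) {
        # $_ is the loop counter here; $_[0] is still the first
        # remaining element of @_ (the caller's second argument).
        push @seen, sprintf("%d/%s", $_, $_[0]);
    }
    return @seen;
}

my @seen = show_args('data.txt', 3);
print "@seen\n";    # prints: 0/3 1/3 2/3
```

So after the shift, a test such as $_[0] == 3 checks the caller's second argument on every pass, regardless of what the current input line contains.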

      Hmmm, what version of perl are you running?

      My current perl doesn't have a problem with

      for (0..$#xdata) { push @frmt, [ $xdata[$_], \@{$ydata} ] }
      when @xdata is empty, but maybe you have an older one, and it infinite loops on the 0..-1?

      I know this is below where you think the script stops, but I notice that you don't do your data dump until 3 lines past the close...

      I don't recall if perl ever had such a misbehaviour, but it could explain why things don't work the way you expect, since @xdata is populated in the part of the loop where you're looking at @_ instead of @values.

      Please try

      perl -e'for(0..$#a){print "$_\n"}'
      from the command line, and make sure it returns without printing anything. If that's the case, this isn't the bug. If it DOES print out an endless stream of numbers, then there is your problem.
      --
      Mike
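
The same check can be written as a small script, which makes the expected behaviour easier to see. This is only a sketch of what a current perl does with an empty array; the variable names are chosen for the example.

```perl
use strict;
use warnings;

my @xdata;                       # deliberately left empty
my $last  = $#xdata;             # last index of an empty array is -1
my @range = (0 .. $#xdata);      # 0 .. -1 is an empty list

my $iterations = 0;
$iterations++ for 0 .. $#xdata;  # loop body should never run

print "last index: $last, range size: ", scalar(@range),
      ", iterations: $iterations\n";
# prints: last index: -1, range size: 0, iterations: 0
```

If the loop body ran even once here, the range handling would be suspect.
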
        Perl 5.6.1, ActiveState build 633

        I apologize for any misunderstandings... I took out the debugging statements when I posted (but one got through). The code most certainly doesn't get past the while loop. I had it printing the time stamp values, and it stops at the last time stamp in the file I tested it on. (By the way, the $_[0] variable I reference from time to time is just a quick hack I threw in. It's an option that helps somewhat with debugging and also serves as a test for a future feature: it lets the user specify that certain data has already been processed and is in a specified file (or the default one), so there is no need to process it again. I process certain files in this routine to get both the data for an XY scatter plot and the averages, so I threw that in.)

        And $PART3 should not be set back to 0 once it is set. There are three button clicks in the file (a click is represented by a 0, no click by a 5). I want the data from the last button click onward. There are several little 'complications' with this reasoning, but they don't really affect anything here and are mainly a result of the data.
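
The "set once, never reset" flag described above can be sketched on its own, away from the rest of the routine. This is a minimal sketch under stated assumptions, not the original code: the sample rows, $latched and @kept are invented for the example, and only the first of the two trigger tests in the posted sub is reproduced. It reads from an in-memory filehandle, which needs perl 5.8 or later (newer than the 5.6.1 mentioned above), so a real file would have to be used on older builds.

```perl
use strict;
use warnings;

# Hypothetical tab-separated sample: time, three readings,
# button column (0 = click, 5 = no click).
my $sample = join "", map { join("\t", @$_) . "\n" } (
    [ 2.4, 68.1, 77.0, 0.04, 5 ],
    [ 2.5, 68.2, 77.1, 0.04, 0 ],    # click at t >= 2.5: latch here
    [ 2.6, 68.3, 77.2, 0.04, 5 ],
);

# In-memory filehandle (perl 5.8+), standing in for the data file.
open my $in, '<', \$sample or die "cannot open in-memory sample: $!";

my $latched = 0;    # plays the role of $PART3
my @kept;           # time stamps seen once the flag is set
while (my $line = <$in>) {
    chomp $line;
    my @values = split /\t/, $line;
    $latched = 1 if $values[4] == 0 && $values[0] >= 2.5;
    next unless $latched;            # skip everything before the click
    push @kept, $values[0];
}
close $in;

print "kept times: @kept\n";    # prints: kept times: 2.5 2.6
```

Because the flag is never cleared, every row after the triggering click is kept, including rows whose button column is back to 5.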