
Out of Memory

by Michael Kittrell (Novice)
on Mar 27, 2013 at 17:45 UTC (#1025770=perlquestion)

Michael Kittrell has asked for the wisdom of the Perl Monks concerning the following question:

Can someone explain to me the difference between
my $nulls = $_; $nulls =~ /\0/g; $nulls = length($nulls);
my $nulls = () = $_ =~ /\0/g;
I get an "Out of memory!" error with the second example once $_ is long enough; with shorter strings it works fine.

I assume it has to do with combining several operations on one line and the particulars of how that is managed in memory. I'd love to know exactly what is happening and how it affects memory consumption, stack space, and so on.

Thanks in advance,


Replies are listed 'Best First'.
Re: Out of Memory
by davido (Cardinal) on Mar 27, 2013 at 18:12 UTC

    Those don't even do the same thing. In the first example, if the length of $_ is 100, in the end $nulls will contain the integer 100, after jumping through the pointless hoop of a pattern match.
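The difference davido describes can be seen with a small, self-contained sketch (the sample string here is an illustrative assumption, not from the original post):

```perl
use strict;
use warnings;

# Hypothetical sample: three NULs embedded in a 10-character string.
my $str = "ab\0cd\0ef\0g";

# First snippet: the match result is discarded, so $nulls1 ends up
# holding the length of the whole string, not the NUL count.
my $nulls1 = $str;
$nulls1 =~ /\0/g;            # matches, but the copy is left unchanged
$nulls1 = length($nulls1);   # 10

# Second snippet: list assignment evaluated in scalar context yields
# the number of elements matched, i.e. the actual NUL count.
my $nulls2 = () = $str =~ /\0/g;   # 3

print "length: $nulls1, nulls: $nulls2\n";
```

So the first form reports the string's length regardless of content, while the second genuinely counts the matches.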


      Hmm you're right those don't do the same things.

Incidentally, this also caused the out of memory error:

($nulls) = $_ =~ /\0/g;

      however, I found another method that works and doesn't seem as likely to cause the extra memory overhead.

while ($_ =~ /\0/g) { $nulls++ }

        The simplest, fastest and most efficient way to count the nulls (or any character) in a string is:

        my $nulls = $string =~ tr[\0][\0];

        Update: corrected '0' to '\0'. Thanks to davido.
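A minimal sketch of the tr/// counting idiom above, using an assumed sample string (tr/// is often written with an empty replacement list when only counting; both spellings behave the same):

```perl
use strict;
use warnings;

my $string = "a\0b\0\0c";   # illustrative: three NULs

# In scalar context, tr/// returns the number of characters matched.
# With an empty replacement list the search list is reused, so the
# string is effectively left unmodified -- it only counts.
my $nulls = $string =~ tr/\0//;

print "$nulls\n";   # 3
```

Because tr/// scans the string once with no regex engine and no temporary list, it stays fast and flat in memory no matter how many NULs there are.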


        Whatever method you use, you're teetering on the edge. I would probably prefer taking in smaller chunks and processing them individually rather than trying to hold the entire thing in memory at the same time. Even if while( $_ =~ /\0/g ) { $null++ } keeps you below the mark, if your file grows by some small amount, you'll be back to bumping into the memory limit again.

        In other words, none of your methods really address the elephant in the corner, which is that holding the entire data set in memory at once is consuming all your wiggle-room.
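A sketch of the chunked approach suggested above, assuming the data comes from a file; the function name, chunk size, and file path are illustrative choices, not from the original thread:

```perl
use strict;
use warnings;

# Stream the file in fixed-size chunks instead of slurping it all,
# counting NULs per chunk with tr///. Peak memory stays at one
# chunk's worth regardless of how large the file grows.
sub count_nulls_in_file {
    my ($path, $chunk_size) = @_;
    $chunk_size ||= 1 << 20;    # default: 1 MB per read

    open my $fh, '<:raw', $path or die "open $path: $!";
    my $nulls = 0;
    while (read($fh, my $buf, $chunk_size)) {
        $nulls += $buf =~ tr/\0//;
    }
    close $fh;
    return $nulls;
}
```

Since tr/// counts whole characters, a NUL can never straddle a chunk boundary, so no carry-over logic between reads is needed.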


Re: Out of Memory
by CountOrlok (Friar) on Mar 27, 2013 at 18:45 UTC
    The second piece of code will match all the nulls, build a temporary list of the matches, and then assign the size of that list to $nulls. From a couple of tests I did using Devel::Size, an array with half a million nulls will take up about 320 MB. If you go any larger, this will quickly exhaust memory on a standard computer.
      Correction: an array of 5 million nulls will take up about 320MB.
        5 million nulls, even if each were double-byte, should take up less than 10 MB of memory, no? If array overhead doubled that, we're still at 20 MB per copy?

        What is it about the workings of this second statement that causes it to consume so much memory?

        I'd like to understand how that works if anyone happens to know.
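A back-of-envelope sketch of why the numbers are so much larger than 10 MB: in list context, each /\0/g match becomes a full Perl scalar (an SV), not a single byte. On a 64-bit perl an SV carries a multi-pointer header plus a string buffer and allocator overhead, so each match costs tens of bytes. The 64-bytes-per-scalar figure below is an assumed round number for illustration, not a measured value:

```perl
use strict;
use warnings;

# Rough estimate: 5 million match scalars at an assumed ~64 bytes each.
my $matches  = 5_000_000;
my $sv_bytes = 64;                               # assumed per-SV cost
my $total_mb = $matches * $sv_bytes / (1024 * 1024);

printf "~%.0f MB for %d match scalars at %d bytes each\n",
       $total_mb, $matches, $sv_bytes;
```

At that assumed rate the total lands around 305 MB, which is in the same ballpark as the ~320 MB that Devel::Size reported above. That is why counting with tr/// or a while loop, which never materializes the list, avoids the blow-up.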

Node Type: perlquestion [id://1025770]
Front-paged by Corion