http://www.perlmonks.org?node_id=356184

crabbdean has asked for the wisdom of the Perl Monks concerning the following question:

I have a class that includes an iterator method for outputting in a stream-like fashion. (tye's response in the thread "Re: Streaming to Handles (iterator)" shows pretty much how the code is constructed.) Since then I've slightly amended the code to include some other "features". It all works beautifully, although I've noticed it now leaks memory. I believe I've located the source of the leak, but I'm not sure how to retain the features while removing it. The code below highlights the problem.

I push an array reference to the $self->{files} array by using the below code:
my @array = ();
push (@array, $rel_path);   ## in my code more than one thing is pushed to @array.
push (@{$self->{files}}, \@array);
(I'm sure that's where the problem lies!)

In the "next" iterator method I then "return" these references from the $self->{files} array:
sub next {
    my( $self )= @_;
    while( 1 ) {
        if( @{ $self->{files} } ) {
            my $file = shift @{ $self->{files} };   ## HERE!
            if( -d $file ) {
                push @{ $self->{dirs} }, $file;
            }
            return $file;
        }
        if( ! @{ $self->{dirs} } ) {
            return;
        }
        my $dir= shift @{ $self->{dirs} };
        if( opendir( DIR, $dir ) ) {
            $self->{files}= [
                map {
                    File::Spec->catfile( $dir, $_ );
                } File::Spec->no_upwards( readdir(DIR) )
            ];
            closedir DIR;
        } else {
            warn "opendir failed, $dir: $!\n";
        }
    }
}
... which I then dereference in my calling script:
my $file;
while( $file = $f->next() ) {
    print "@{$file}\n";
}
Now I can see that pushing array references this way leaves a reference count against @array, and hence @array is never destroyed. Over time this means RAM gets munched, and for a long-running script it ends with my computer dying. But I need to be able to push a set of values (possibly as an array, unless there is an alternative) onto the $self->{files} array, which I can then retrieve together as a set.

Is there a way to achieve this without leaving a reference count against @array (and hence removing the memory leak) ?

Does all of that make sense?

UPDATE: After writing this I lay down in bed and a thought suddenly struck me: instead of making an @array to reference, why not just push an anonymous array?
push (@{$self->{output}}, [$rel_path, $var1, $var2]);
I'll have to test this tomorrow but if anyone wishes to comment then please do so. :-)
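
As a rough sketch of the two forms side by side (the `$self` hash, the sample paths, and the `'extra'` value are stand-ins invented for illustration, not the real class): a `my @array` declared inside a loop body yields a fresh array on every pass once a reference to it has escaped, so pushing `\@array` and pushing an anonymous `[ ... ]` should both leave one independent record per push.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the object; only the files queue matters here.
my $self = { files => [] };

for my $rel_path ('a.txt', 'b.txt') {
    # Form 1: named lexical array, then push a reference to it.
    my @array = ($rel_path, 'extra');
    push @{ $self->{files} }, \@array;

    # Form 2: anonymous array built in place.
    push @{ $self->{files} }, [ $rel_path, 'extra' ];
}

# Count distinct references by their stringified addresses.
my %seen = map { ("$_" => 1) } @{ $self->{files} };
print scalar(keys %seen), " distinct records\n";
print "@{$_}\n" for @{ $self->{files} };
```

Under these assumptions neither form holds on to anything extra: in both cases the only lasting reference to each record is the one stored in the queue.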

Dean
The Funkster of Mirth
Programming these days takes more than a lone avenger with a compiler. - sam
RFC1149: A Standard for the Transmission of IP Datagrams on Avian Carriers

Replies are listed 'Best First'.
Re: Memory Management and Array references
by dave_the_m (Monsignor) on May 25, 2004 at 12:56 UTC
    You can avoid an extra refcount on @array by wrapping it in a narrow scope:
    { my @array = ....; push (@{$self->{files}}, \@array); }
    but this is unlikely to be the source of your problem. Can you produce a complete but stripped down bit of code that reproduces this problem? Then we could run it on our own machines and advise you better. ie something that just initialises the array, then calls next repeatedly.
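
A minimal sketch of the narrow-scope point, assuming a plain hash in place of the real object and an invented sample record. It uses a weak reference from Scalar::Util as a probe: a weak reference does not keep its target alive and becomes undef once the target is freed.

```perl
use strict;
use warnings;
use Scalar::Util qw(weaken);

# Hypothetical stand-in for the iterator object.
my $self = { files => [] };
my $probe;    # weak reference used only to observe the array's lifetime

{
    my @array = ('some/rel/path', 'more', 'data');
    push @{ $self->{files} }, \@array;
    $probe = \@array;
    weaken($probe);    # the probe no longer counts as a reference
}

# The name @array is gone, but the queue still holds the array.
my $alive_after_block = defined $probe;
print "after block: ", ($alive_after_block ? "alive" : "freed"), "\n";

# Take the record off the queue and drop it: the last strong reference goes away.
my $record = shift @{ $self->{files} };
undef $record;

my $alive_after_shift = defined $probe;
print "after shift: ", ($alive_after_shift ? "alive" : "freed"), "\n";
```

Once the block has ended, the queue entry is the only thing keeping the array alive, so shifting it off and discarding it is enough for Perl to free the memory.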
      Thanks Dave, although the @array is already in a small scope (the "next" iterator). The problem is the reference to @array that is pushed onto a global array, $self->{files}. The refcount created by that means @array sticks around until $self is destroyed, and that is IF garbage collection is even able to find and delete all the references, which I doubt. Run that over a terabyte server and I think this is my problem.

      Heading to bed now but I'll post some working code for you to test tomorrow. Note though, my included link Re: Streaming to Handles (iterator) contains pretty much the "guts" of the code except for the few extra lines I posted in my example.

      Dean
Re: Memory Management and Array references
by chromatic (Archbishop) on May 25, 2004 at 17:15 UTC

    It's not a memory leak. Perl is doing exactly what you tell it to do. When you're finished with $self->{files}, delete it.

      I'm sorry, but that's not helpful at all. I showed an awareness that the problem was caused by how I coded it and by how Perl handles memory management given that code. $self->{files} is deleted when the object goes out of scope. It's the abandoned references to the arrays within $self->{files} that are the problem. Can you offer any advice on how to solve that problem? That would be helpful.

      Dean

        Any structure will be freed in perl when the last reference to it goes away; in your iterator, when you've exhausted the top-level directory and start working on the next you have:

        if( opendir( DIR, $dir ) ) {
            $self->{files}= [
                map {
                    File::Spec->catfile( $dir, $_ );
                } File::Spec->no_upwards( readdir(DIR) )
            ];
            closedir DIR;
        } else {
            ...
        }

        At that point, when you assign a new arrayref to $self->{files}, the previous arrayref in that spot is overwritten and therefore its reference count is reduced - if that was the only reference to the previous arrayref, it will automatically be freed.

        So, to free unneeded memory in perl all you need to do - all you should do - is precisely to remove all references to the data.

        I suspect therefore that the problem is not where you think it is. I suggest that you step back and compose a new question - start off by telling us what makes you think you are leaking memory, and what in the process of tracking that down led you to believe the problem lay here.

        Hugo
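
The behaviour described above, where overwriting $self->{files} releases the previous arrayref and everything only it referred to, can be sketched as follows. The `Tracker` class and its `$freed` counter are hypothetical helpers invented for this demo; they exist only so that DESTROY can report when Perl actually frees something.

```perl
use strict;
use warnings;

package Tracker;    # hypothetical helper class, used only for this demo
our $freed = 0;
sub new     { my ($class, $name) = @_; return bless { name => $name }, $class }
sub DESTROY { $freed++ }    # called by Perl when the last reference goes away

package main;

# Hypothetical stand-in for the iterator object: three queued records,
# each an arrayref holding one tracked object.
my $self = { files => [ map { [ Tracker->new($_) ] } qw(a b c) ] };

my $freed_before = $Tracker::freed;
print "freed before overwrite: $freed_before\n";

# Same shape as the assignment in next(): a new arrayref replaces the old one.
$self->{files} = [ [ Tracker->new('d') ] ];

my $freed_after = $Tracker::freed;
print "freed after overwrite: $freed_after\n";
```

With nothing else referring to the old list, the counter should go from 0 to 3 at the moment of the assignment, without any explicit cleanup.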

Re: Memory Management and Array references
by crabbdean (Pilgrim) on Jul 19, 2004 at 22:02 UTC
    I like to try and do followups to my posts so that it serves as a solution database for others as much as possible. Often it takes some time though to find out what's going on before I post back a result.

    After, dare I say, about 100+ hours of debugging on this problem my code was looking pretty clean (well clean enough). I then found out about Devel::Peek the other day (I love this module BTW) and went through most of the script checking the creation and destruction of my objects and variables all the way through, checking reference counts etc. All appeared good. Needless to say by now I was reaching the height of frustration.

    It had occurred to me before that maybe what I was experiencing was a memory management issue with my OS (Win2000), so I re-imaged a PC at work with XP and ran my script on it across our terabyte file server. It ripped through the entire file server in about 2 hours without breaking a sweat!!! Not a memory leak in sight.

    So there you have it, my kung-fu was good, my OS was not! A lesson well learnt.


    Dean