PerlMonks  

Memory leaks and reference counting within Perl.

by DigitalKitty (Parson)
on Jan 05, 2005 at 22:38 UTC (#419754=perlquestion)
DigitalKitty has asked for the wisdom of the Perl Monks concerning the following question:

Hi all.

I'm requesting a little clarification with the concept of a memory leak and reference counting. As I understand it, Perl does not use either 'mark and sweep' or anything resembling the JVM 'garbage collector'. Rather, it increments / decrements reference counts in order to ascertain when memory should be re-claimed. See below:

my $hash_ref;
{
    my %monks = (
        Zaxo        => 'W. Virginia',
        tye         => 'California',
        davido      => 'California',
        theorbtwo   => 'Germany',
        castaway    => 'Germany',
        atcroft     => 'Georgia',
        nothingmuch => 'Israel',
        rozallin    => 'England',
    );
    my $hash_ref = \%monks;
}


When this block ends, the %monks hash is no longer in scope so the memory it occupied is re-claimed. However, since $hash_ref was not declared inside the block and it has been initialized with the address of the %monks hash, it remains in scope (thereby creating a memory leak)?

One may still perform a variety of operations on the hash reference (and consequently the hash itself). Is this correct? To reduce the reference count to 'zero' (eliminating any potential memory leak that is present), I could simply set the $hash_ref variable to 'undef'(?)

Thanks,

~Katie

Re: Memory leaks and reference counting within Perl.
by davido (Archbishop) on Jan 05, 2005 at 22:57 UTC

    Actually there is no memory leak in your code. You're redeclaring $hash_ref inside the inner block, and it falls out of scope after the last curly bracket. The fact that a variable by the same name has been declared at a broader scope too is irrelevant. The broader-scoped $hash_ref is never being assigned anything, as it is masked by the inner-scoped $hash_ref.

    Now on the other hand, if you made an assignment to $hash_ref without declaring it at the inner scope, after the last curly bracket $hash_ref wouldn't fall out of scope, and the reference count wouldn't drop to zero until the broad-scoped $hash_ref lexical falls out of scope too... in this case at the end of the program.

    This isn't a memory leak unless you lose track of $hash_ref without letting it fall out of scope somehow.


    Dave
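
The difference between the two cases above can be checked directly. This is a minimal sketch, not code from the thread: outer_after_block is a hypothetical helper, and the hash is shortened to two entries. It returns the outer $hash_ref after a block that either redeclares the variable or assigns to it.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): returns the outer $hash_ref
# after an inner scope that either shadows it or assigns to it.
sub outer_after_block {
    my ($shadow) = @_;
    my $hash_ref;
    {
        my %monks = ( Zaxo => 'W. Virginia', tye => 'California' );
        if ($shadow) {
            my $hash_ref = \%monks;    # new lexical that masks the outer one
        }
        else {
            $hash_ref = \%monks;       # assigns to the outer lexical
        }
    }
    return $hash_ref;
}

print "shadowed: ", ( defined( outer_after_block(1) ) ? "defined" : "undef" ), "\n";
print "assigned: ", ( defined( outer_after_block(0) ) ? "defined" : "undef" ), "\n";
```

In the shadowed case the outer variable is never touched, so nothing keeps the hash alive once the inner scope ends.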

Re: Memory leaks and reference counting within Perl.
by habit_forming (Monk) on Jan 05, 2005 at 23:17 UTC
    There are a couple of things happening here.

    1) Inside the block, $hash_ref is re-my'd and is therefore a completely new variable, so no reference to %monks makes it outside the block. This means the outer $hash_ref is still undef after the block.

    2) If you remove the second my and still point $hash_ref at %monks, the memory will not be cleaned up when the block ends, because a variable still holds a reference to it and its reference count is not zero. The memory previously referred to as %monks is no longer accessible by that name, but it is accessible as %$hash_ref. If you then set $hash_ref = undef; the final reference to that memory goes away and its reference count drops to zero. Perl then frees the hash; the memory generally goes back into Perl's own pool to be reused rather than being returned to the operating system.

    At least that is the way I understand it. Those with more internals experience may prove me wrong. =)

    Cheers!
    --habit
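
Point 2 can be observed with a weak reference used as a probe. This is a sketch under assumptions: the $probe variable and the shortened hash are illustrative, and Scalar::Util's weaken is used only because a weak reference does not count towards the reference count and is set to undef when its target is freed.

```perl
use strict;
use warnings;
use Scalar::Util qw(weaken);

my $hash_ref;
{
    my %monks = ( Zaxo => 'W. Virginia', tye => 'California' );
    $hash_ref = \%monks;    # no inner "my": the outer variable holds the reference
}

# Weak copy used purely as a probe; it does not keep the hash alive.
my $probe = $hash_ref;
weaken($probe);

my $alive_after_block = defined $probe ? 1 : 0;    # the hash survived the block
my $tye               = $hash_ref->{tye};          # still usable through the reference

undef $hash_ref;                                   # drop the last strong reference
my $alive_after_undef = defined $probe ? 1 : 0;    # the hash has now been freed

print "after block: $alive_after_block, after undef: $alive_after_undef\n";
```
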
Re: Memory leaks and reference counting within Perl.
by Joost (Canon) on Jan 05, 2005 at 23:34 UTC
    I'll just pretend the 3rd "my" in your code isn't there. Others have already pointed it out...

    One may still perform a variety of operations on the hash reference (and consequently the hash itself). Is this correct?

    Yes, and that's why this isn't a memory leak :-) Java has the same behaviour in this case: if you can get at it, it's kept in memory.

    To reduce the reference count to 'zero' (eliminating any potential memory leak that is present), I could simply set the $hash_ref variable to 'undef'(?)
    You could. You could also let the reference go out of scope, and it will be cleaned up automatically. The only problem with a reference counting garbage collector is circular references. Like this:

    {
        my %monks = (
            Zaxo      => 'W. Virginia',
            tye       => 'California',
            theorbtwo => 'Germany',
            castaway  => 'Germany',
            atcroft   => 'Georgia',
            rozallin  => 'England',
        );
        # %monks has a reference count of 1

        my $ref = \%monks;
        # $ref has a reference count of 1
        # %monks has a reference count of 2

        $monks{myself} = \%monks;
        # %monks has a reference count of 3
    }
    # leaving the scope decrements the reference count
    # for %monks and $ref by one.
    # $ref is collected (has a refcount of 0)
    #  - decrement the reference count for monks
    #    by one again
    # "%monks" has a reference count of 1
    # and is not garbage collected

    The reason %monks is not collected is because there still is a reference to %monks in $monks{myself}. $monks{myself} would be removed if %monks were collected, but there is still a reference to %monks in $monks{myself}... etc...

    A mark-and-sweep collector can detect these circular references and would also collect %monks here. Perl doesn't (except at the end of execution).
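
One way to see the cycle problem, and the usual workaround, is to count destructor calls. The sketch below is illustrative only: the Probe class and destroyed_after_scope are hypothetical names, a blessed hash stands in for %monks so that DESTROY can be observed, and Scalar::Util's weaken is one possible way to break the cycle.

```perl
use strict;
use warnings;
use Scalar::Util qw(weaken);

# Hypothetical probe class: counts how many of its objects have been destroyed.
package Probe;
our $destroyed = 0;
sub new     { return bless {}, shift }
sub DESTROY { $destroyed++ }

package main;

# Returns how many Probe objects were destroyed when the inner scope ended.
sub destroyed_after_scope {
    my ($use_weaken) = @_;
    my $before = $Probe::destroyed;
    {
        my $monks = Probe->new;
        $monks->{myself} = $monks;                  # the structure refers to itself
        weaken( $monks->{myself} ) if $use_weaken;  # make the self-reference weak
    }
    return $Probe::destroyed - $before;
}

print "strong cycle: ", destroyed_after_scope(0), "\n";
print "weakened:     ", destroyed_after_scope(1), "\n";
```

With the strong cycle the object is only cleaned up at the end of execution; with the weakened self-reference it goes away as soon as the scope ends.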

Re: Memory leaks and reference counting within Perl.
by dmitri (Curate) on Jan 05, 2005 at 23:41 UTC

      http://www.perlmonks.org/?node_id=336883

      Or for those who prefer the simple automation of clickable links, that is: Re: undefining hashes to free memory ([id://336883] is how we do links around here ;) ).

      Now back to your regular programming, currently in progress...


      Dave

Re: Memory leaks and reference counting within Perl.
by ysth (Canon) on Jan 06, 2005 at 03:09 UTC
    Consider this code:
    my @AoH;
    for (0..9) {
        my %monks = (
            Zaxo      => 'W. Virginia',
            tye       => 'California',
            theorbtwo => 'Germany',
            castaway  => 'Germany',
            atcroft   => 'Georgia',
            rozallin  => 'England',
        );
        print "\%monks is at address ", 0+\%monks, "\n";
        # save a reference every other iteration
        $AoH[$_/2] ||= \%monks;
    }
    print "\%{\$AoH[$_]} is at address ", 0+$AoH[$_], "\n" for 0..$#AoH;
    __END__
    %monks is at address 269696284
    %monks is at address 269696716
    %monks is at address 269696716
    %monks is at address 269696812
    %monks is at address 269696812
    %monks is at address 269696908
    %monks is at address 269696908
    %monks is at address 269697004
    %monks is at address 269697004
    %monks is at address 269706344
    %{$AoH[0]} is at address 269696284
    %{$AoH[1]} is at address 269696716
    %{$AoH[2]} is at address 269696812
    %{$AoH[3]} is at address 269696908
    %{$AoH[4]} is at address 269697004
    This shows that when some reference to the my %monks outlives the block, my %monks is allocated a new hash, and otherwise it reuses the same space.

    When the "leaked" references become undefined (by the array elements being set to something else or the array going out of scope) the hashes referred to are indeed freed, unless a circular reference is created (e.g. $AoH[0]{ref}=$AoH[0]).

Re: Memory leaks and reference counting within Perl.
by matra555 (Monk) on Jan 06, 2005 at 08:14 UTC
    Hey DK.

    Total shot in the dark here, but...perhaps Data::Dumper might be of some help?

    UPDATE: with total credit to castaway, here's this answer

    'using data::dumper or print after your loop, you would have noticed that $hash_ref is empty'

      Total shot in the dark here, but...perhaps Data::Dumper might be of some help?

      Not unless you have some need to dump the datastructure. Doesn't have much to do with the question of lexical scoping, reference counts, and garbage collection though.


      Dave

        Well there is the point that using a module such as Data::Dumper or Data::Dump::Streamer (with the former in Purity() mode) you would at least readily notice any cyclic references in the data structure.

        ---
        demerphq

Re: Memory leaks and reference counting within Perl.
by hv (Parson) on Jan 06, 2005 at 08:45 UTC

    I think there may be confusion about what constitutes a "memory leak". A memory leak occurs if memory is not freed at the point the last reference to that memory goes away. There are 3 ways to get a memory leak in a perl program: construct a circular reference (as described by other comments above); find and tickle a bug in perl's reference counting; or find and tickle a bug in some other external non-perl code (such as an XS library).

    It is possible to write a program that consumes more memory than it needs to because it keeps unnecessary references to data, thus making the memory unavailable for reclamation. But that is simply an inefficiency in the program (often fixable by more carefully restricting the scope of variables), not a memory leak.

    Hugo

      There are 3 ways to get a memory leak in a perl program: construct a circular reference (as described by other comments above); find and tickle a bug in perl's reference counting; or find and tickle a bug in some other external non-perl code (such as an XS library).
      And those potential bugs in the perl interpreter and XS libraries are IMO the biggest drawback of a reference counting scheme. Circular references in perl are usually easily avoided (you can now even use weakrefs with the standard distribution if you need to) - but when you're writing extensions or modifying the interpreter you need to get the reference count correct in all C/C++ code using perl's data types. This is extremely brittle, when you consider passing subroutine arguments, changes in scope, stuffing data in lexicals, mortal SV*s etc etc. At least, I usually mess it up hard :-)

      As far as I understand it, a "real" garbage collector can replace most of this mess with a centralized (but complicated) garbage collection algorithm, at the cost of (usually) not guaranteeing a specific destroy time for objects (perl objects are at the moment guaranteed to have DESTROY called at the exact moment they go out of scope). AFAIK perl 6 is going to implement a scheme like this - which means you won't be able to count on "timely destruction" in perl 6 (at least, last time I looked).
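
The "timely destruction" guarantee in Perl 5 can be illustrated by recording the order of events around a block. This is a small sketch, not code from the thread: the Guard class and the @events log are hypothetical names chosen for the example.

```perl
use strict;
use warnings;

our @events;    # log of what happened, in order

# Hypothetical guard class: records the moment its object is destroyed.
package Guard;
sub new     { my ( $class, $name ) = @_; return bless { name => $name }, $class }
sub DESTROY { push @main::events, "destroy $_[0]{name}" }

package main;

{
    my $guard = Guard->new('inner');
    push @events, 'inside block';
}    # last reference to the guard disappears here, so DESTROY runs here
push @events, 'after block';

print "$_\n" for @events;
```

Under reference counting the destructor entry lands between the other two; a collector that runs at arbitrary times would not promise that ordering.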

Re: Memory leaks and reference counting within Perl.
by Aragorn (Curate) on Jan 06, 2005 at 09:45 UTC
    Devel::Peek enables you to see the reference counts of variables, among a lot of other things. You need to understand some of what goes on "under the hood", for which I found Simon Cozens' Perl 5 Internals tutorial very useful.

    I think that playing with Dump (which is a Devel::Peek function) will shed some light on how things work.

    Arjen
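
For a numeric view of the same information, the core B module can read a reference count directly; Devel::Peek's Dump prints it (as REFCNT) to STDERR along with much more. The sketch below is an assumption-laden illustration: refcount is a hypothetical helper, and because each call passes a temporary \%monks, the absolute numbers include that temporary, so only the differences are meaningful.

```perl
use strict;
use warnings;
use B ();

# Hypothetical helper: reference count of whatever $_[0] points at,
# read through the core B module.
sub refcount { return B::svref_2object( $_[0] )->REFCNT }

my %monks = ( Zaxo => 'W. Virginia' );

my $base = refcount( \%monks );    # includes the temporary \%monks itself
my $ref  = \%monks;                # one more reference to the hash
my $with_ref = refcount( \%monks );
undef $ref;                        # and one fewer again
my $after_undef = refcount( \%monks );

printf "change after taking a reference: %+d\n", $with_ref - $base;
printf "change after undef:              %+d\n", $after_undef - $with_ref;
```
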

Re: Memory leaks and reference counting within Perl.
by nothingmuch (Priest) on Jan 09, 2005 at 09:15 UTC
    Well, I have one objection... You left me out of your hash. ;-)

    Many monks have mentioned your second my, and the fact that circular refs are needed to create leaks.

    What I have to add is a module I recently became aware of, Data::Structure::Util, which is an easy way to get your circular refs cleaned out - circular_off($structure). Less recently useful were Devel::Cycle and Test::Memory::Cycle, which is based on the former. I always Test::Memory::Cycle to make sure that I've not overlooked something, and also to make sure, before I Scalar::Util::weaken, that the circular ref really exists.

    There are also some modules that keep track of objects surviving into global destruction, like Devel::ObjectTracker or Devel::Leak::Object, but these I've found to be less useful, because they're a bit more of a hassle.

    -nuffin
    zz zZ Z Z #!perl
