Object reference disappearing during global destruction

by khkramer (Scribe)
on May 15, 2002 at 17:06 UTC (#166791=perlquestion)
khkramer has asked for the wisdom of the Perl Monks concerning the following question:

I've been hunting a nasty bug for the last two days. The short version is that one field -- containing an object reference -- from some of my hash-based objects is getting undef'ed at some seemingly arbitrary time during global destruction. Here is some pretend code that illustrates the problem; if this pseudo-code exhibited the same behavior as my buggy code, running this snippet would trigger the die in the Badly_Behaved package's DESTROY (I'm running perl 5.6.1 under a 2.4.x GNU/Linux):

    package Badly_Behaved;

    sub new {
        my $class = shift;
        my $self  = {};
        bless( $self, $class );
        $self->{_Some_object_ref} = Foo::Get_other_object_ref();
        return $self;    # return the object, not the field's value
    }

    sub DESTROY {
        my $self = shift;
        die 'terrible problem!' if !defined $self->{_Some_object_ref};
    }

    package main;

    $hmph = Badly_Behaved->new();

Of course, in real life things aren't so simple. This bug happens during global destruction of an XML::Comma execution environment: several dozen objects (at the minimum) are being cleaned up. The vast majority of the time, there are no ill effects. Even though a few fields seem to be disappearing before they should, destructors are usually called in an order that doesn't trigger any problematic behavior.

But not quite always. We isolated a test-case yesterday. Here is the original oddball snippet (courtesy of Eric Loeb) that I started working from:

    #!/usr/bin/perl -w
    use strict;
    $|++;
    use XML::Comma;

    my $DOWNLOAD_FILE = "DOWNLOAD_TIME";
    my $ldt = 1021398860;
    my $local_users = XML::Comma::Def->read(name=>'W_User')->get_index('main')->
        iterator(where_clause=>"record_last_modified > $ldt");
    while ($local_users++) { }

    sub dump_log { }

which produces the following error:

(in cleanup) Can't call method "def_by_name" on an undefined value at /usr/lib/perl5/site_perl/5.6.1/XML/Comma/ line 288 during global destruction.

The object reference that the cleanup code is looking for has disappeared. Removing the empty dump_log subroutine definition eliminates the error message. Changing the $local_users post-increment to a pre-increment eliminates the error message. Lots of other tiny, apparently-unrelated changes eliminate the error message.

After much wailing and gnashing of print statements, we boiled the problem down to the "undef'ing reference" issue. The critical references always go away during global destruction earlier than it looks like they should, but the itty bitty changes have the effect of re-arranging the order in which destructors are called, and in most orderings the reference disappearance doesn't cause any problems. Here's the slimmed down test script:

    #!/usr/bin/perl -w
    use strict;
    $|++;
    use XML::Comma;

    $::index = XML::Comma::Def->read(name=>'AllAfrica_NewsStory')->get_index('post');
    print "-->index/def: " . $::index . ' ' . $::index->{_def} . "\n";

The my'ed variable has moved to the main package, so that it's easy to peek at it from various other places. The last line is only there so that I have a convenient sanity check and place to halt the debugger. This code does not produce any errors, but I've got everything instrumented so that I can see the $::index->{_def} reference disappearing during global destruction.

At this point, I've added an explicit DESTROY for every object in the system, so that I can stick in print statements and/or have a place to halt the debugger. During global destruction, everything trips merrily along, with 70-odd objects passing peacefully into the night. Then -- in some way that I haven't yet been able to pinpoint -- the $index->{_def} reference becomes ! defined. This is the only field in the $index object that loses its content. It's also the only field that is an object reference; the others are scalars, hash refs and array refs.

The debugger (run with its inhibit_exit option set to 0, to enable tracing through global destruction) doesn't show any code changing the field's content. Setting a Watch on $index->{_def} does stop the debugger when $index->{_def} becomes undefined, but no statements that show up in the trace ever do anything to that field. It's like some invisible hand reaches into the system between two of the many, many DESTROY calls, and yanks out that (and only that) reference. The object that $index->{_def} points to is not destroyed until sometime after the reference disappears (as one would expect).

Here's a little snippet of my debugging output, showing what happens:

    D: XML::Comma::NestedElement=HASH(0x86eeb6c)
       XML::Comma::Bootstrap=HASH(0x86a2084)
    D: XML::Comma::Element=HASH(0x86ef81c)
       XML::Comma::Bootstrap=HASH(0x86a2084)
    D: XML::Comma::Element=HASH(0x86ed1e4)
       <undef>
    XML::Comma::Indexing::Index=HASH(0x86ed268)
      _Index_sorts --> HASH(0x86f8abc)
      _def --> <undef>
      _Index_columns_pos --> 13
      _nested_elements --> ARRAY(0x86eb7c8)
      _tag --> index
      _Index_doctype --> AllAfrica_NewsStory
      _init_index --> 38
      _nested_lookup_table --> HASH(0x86eb7d4)
      _attrs --> HASH(0x86ed1f0)
      _Hookable_index_hooks --> ARRAY(0x86f77f0)
      _Hookable_stop_rebuild_hooks --> ARRAY(0x86f8a08)
      _Index_store_type --> post
      _tag_up_path --> DocumentDefinition:index
      _Index_bcollections --> HASH(0x86fc9b4)
      DBH_connect_check --> _check_db
      Doc_storage --> HASH(0x86f7688)
      _Index_columns --> HASH(0x86f8a2c)
    D: XML::Comma::NestedElement=HASH(0x86eea58)
       <undef>

Each DESTROY normally prints out two lines. The D: lines print out the object being destroyed, and the following line prints out $::index->{_def}. The first time a DESTROY finds $::index->{_def} to be undefined, I dump all of the object's fields. It's worth noting that there's still a reference elsewhere in the system to the Bootstrap object, and it's retrievable through that ref on past the point where $::index->{_def} becomes undefined.

And here's the same thing from inside the debugger, showing that no statements other than the debugging lines in the DESTROY are running, yet the value of the field changes:

    D: XML::Comma::NestedElement=HASH(0x8ae51c4)
    XML::Comma::AbstractElement::DESTROY(/u/khkramer/src/perl/XML-Comma/XML/Comma/
    224:      print ' ' . ($::index->{_def}||'<undef>')."\n";
     XML::Comma::Bootstrap=HASH(0x8982c88)
    XML::Comma::AbstractElement::DESTROY(/u/khkramer/src/perl/XML-Comma/XML/Comma/
    226:      if ( (! defined $::index->{_def}) && (! $::index_dumped)) {
    XML::Comma::AbstractElement::DESTROY(/u/khkramer/src/perl/XML-Comma/XML/Comma/
    223:      print 'D: ' . $_[0] . "\n";
    D: XML::Comma::Element=HASH(0x8ae6e24)
    XML::Comma::AbstractElement::DESTROY(/u/khkramer/src/perl/XML-Comma/XML/Comma/
    224:      print ' ' . ($::index->{_def}||'<undef>')."\n";
     XML::Comma::Bootstrap=HASH(0x8982c88)
    XML::Comma::AbstractElement::DESTROY(/u/khkramer/src/perl/XML-Comma/XML/Comma/
    226:      if ( (! defined $::index->{_def}) && (! $::index_dumped)) {
    XML::Comma::AbstractElement::DESTROY(/u/khkramer/src/perl/XML-Comma/XML/Comma/
    223:      print 'D: ' . $_[0] . "\n";
    D: XML::Comma::Element=HASH(0x8ae5110)
    XML::Comma::AbstractElement::DESTROY(/u/khkramer/src/perl/XML-Comma/XML/Comma/
    224:      print ' ' . ($::index->{_def}||'<undef>')."\n";
     XML::Comma::Bootstrap=HASH(0x8982c88)
    XML::Comma::AbstractElement::DESTROY(/u/khkramer/src/perl/XML-Comma/XML/Comma/
    226:      if ( (! defined $::index->{_def}) && (! $::index_dumped)) {
    Watchpoint 0:   $::index->{_def} changed:
        old value:  'XML::Comma::Bootstrap=HASH(0x8982c88)'
        new value:  undef
    XML::Comma::AbstractElement::DESTROY(/u/khkramer/src/perl/XML-Comma/XML/Comma/
    223:      print 'D: ' . $_[0] . "\n";

I'm getting close to running out of ideas on this one, and am really hoping that someone else has seen a similar problem. (Or that there's some painfully-obvious thing about reference counting, destroy semantics or the debugger that I'm missing.)


Replies are listed 'Best First'.
Re: Object reference disappearing during global destruction
by chromatic (Archbishop) on May 15, 2002 at 17:14 UTC
    It's like some invisible hand reaches into the system between two of the many, many DESTROY calls, and yanks out that (and only that) reference.

    Yep. Unfortunately, global destruction is a free-for-all, and you can't count on any sane order of destruction. Perhaps the best option you have is to add an explicit non-DESTROY destructor called manually. (If you're clever, tie it to one of the end-of-scope-action modules on the CPAN, pun intended.)

    Reference counting is all well and good, but at the end of the interpreter, Perl yells "Everybody out of the pool!" and occasionally someone'll leave his little locker key behind in the ensuing madness.
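
The "explicit non-DESTROY destructor" suggested above might look something like the following. This is only a minimal sketch: the Resource class, its cleanup method and the shutdown_all helper are hypothetical names made up for illustration, not part of XML::Comma. The idea is that cleanup work happens while every dependency is still guaranteed to be alive, and DESTROY is left as a harmless fallback.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical class for illustration; not part of XML::Comma.
package Resource;

my @log;    # cleanup messages, in the order they happened

sub new {
    my ( $class, %args ) = @_;
    my $self = { name => $args{name}, dep => $args{dep}, cleaned => 0 };
    return bless $self, $class;
}

# Explicit destructor. Idempotent, so calling it by hand and then
# again from DESTROY is harmless.
sub cleanup {
    my $self = shift;
    return if $self->{cleaned};
    $self->{cleaned} = 1;
    my $dep = defined $self->{dep} ? $self->{dep}{name} : 'none';
    push @log, "$self->{name} cleaned up (dep: $dep)";
    return;
}

# Fallback only: by the time this runs during global destruction,
# $self->{dep} may already be undef.
sub DESTROY { $_[0]->cleanup }

sub log_lines { return @log }

package main;

my $def   = Resource->new( name => 'def' );
my $index = Resource->new( name => 'index', dep => $def );

# Start the chain reaction ourselves, dependents first.
sub shutdown_all {
    for my $obj ( grep { defined } $index, $def ) {
        $obj->cleanup;
    }
    return;
}

shutdown_all();
print "$_\n" for Resource::log_lines();

# Safety net for exits that skip the explicit call (die, exit, ...).
# END blocks run before global destruction begins.
END { shutdown_all() }
```

Because cleanup is idempotent, the explicit call, the END block and the eventual DESTROY can all fire without doing the work twice.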

      But the interpreter isn't cleaning up the object, it's setting the reference to undef. The object is still around (and accessible by other means).

      I like your recreational-swim analogy, but I'm still trying to figure out how the behavior I'm seeing is consistent with the specified operation of the garbage collector. The Camel book says that objects are always destroyed in a separate pass before ordinary references (3rd ed. page 331), and maybe that's a clue, but there's still a problem:

      When the interpreter is ready to exit, presumably "everything" goes out of scope. At that point, there are no references left to my $::index object. So it should be garbage collected pretty early. But it's not and, worse, an object reference it's holding is set to undef at some apparently arbitrary moment. I've been very careful to avoid circular references, so there *is* a sensible destruction order for these objects. But if references disappear without provocation, then that order can't be maintained.

        Maybe Elian will correct me if I'm completely wrong here, but reference counting isn't all that useful during global destruction. First, how do you figure out which items to collect first? Second, how do you deal with circular references? Third, if everything's going to be cleaned up anyway, does it matter? Fourth, it takes a lot longer to come up with a grand master plan than it does to pick a corner of the pool and start sweeping.

        As much as I'd like to believe that you can count on objects being destroyed, during global destruction, in normal, reference-counted order, I've never seen it happen. When it's mattered, I've always had to start the chain reaction myself.

        Let me give it a shot.

        First off, let me make sure I understand. You have a reference to an object, and you are dereferencing that ref in print statements, wondering what's going on, right?

        If you have a reference to an object, then the destruction of that reference will not invoke a call to a DESTROY method; only the object's actual destruction will. The object's destruction will occur in a pass before destruction of anything else. If you have a reference to an object, and the object is destroyed first (which it should be), then the reference points to nothing - undef. If the scalar holding the reference to the object is itself undefed, and the object still exists, then this may be inconsistent with Perl's two pass out-of-the-pool system (where objects go first, so that their DESTROY methods can safely operate on variables in the environment). But, if you simply have a reference to the object, and you dereference that reference, then it should come up undef after global object destruction and before destruction of everything else.

        Hope that helps.


        Update: I just realized this is my 100th post. Yay!

Re: Object reference disappearing during global destruction
by kal (Hermit) on May 15, 2002 at 18:06 UTC

    You really need to clean up yourself - i.e., assuming your objects are already in a hierarchy, or can be put in a hierarchy, then destroy them yourself. If you're working with a collection of objects, for example, maintain a global array or something and keep a list of your objects in there. Then go through and remove them yourself before Perl does.

    The problem with removing objects is when they refer to each other - obviously, the destructor needs to stop somewhere and start removing things. If you can guarantee that doesn't happen, you can write some simple code to destroy the objects before they get de-allocated. You then get to pick what order stuff happens. Presumably your objects are some kind of tree structure, and you can just start at the leaves and work inwards. If you have something more complicated than that, you need to think about the order in which things will get removed - perhaps you have some corner-case data structure which is causing problems (e.g. a loop)?
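
Starting at the leaves and working inwards could be sketched roughly as below. This assumes the objects really do form a tree with no loops; the Node class and its teardown method are hypothetical names invented for the example, and the @order array exists only to make the resulting order visible.

```perl
use strict;
use warnings;

# Hypothetical tree node; the names here are illustrative only.
package Node;

sub new {
    my ( $class, $name ) = @_;
    return bless { name => $name, children => [] }, $class;
}

sub add_child {
    my ( $self, $child ) = @_;
    push @{ $self->{children} }, $child;
    return $child;
}

# Post-order walk: every child is torn down before its parent,
# so the order is chosen by us rather than by the interpreter.
sub teardown {
    my ( $self, $order ) = @_;
    $_->teardown($order) for @{ $self->{children} };
    push @$order, $self->{name};
    $self->{children} = [];    # drop the references explicitly
    return;
}

package main;

my $root = Node->new('root');
my $left = $root->add_child( Node->new('left') );
$left->add_child( Node->new('left.1') );
$left->add_child( Node->new('left.2') );
$root->add_child( Node->new('right') );

my @order;
$root->teardown( \@order );
print join( ' ', @order ), "\n";    # left.1 left.2 left right root
```

In a real program the call to teardown would sit at the end of the main script or in an END block, so that it runs before global destruction starts.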

      I actually do a lot of clean-up management. There are 10k lines of code in this set of modules, and many dozens of objects get created at the beginning (and destroyed at the end) of even a one-liner invocation. But I've always depended -- at bottom -- on the basic reference-counting rules to make sure that object destruction happens in a safe order (and assumed that this should work even during global destruction). Avoiding circular references and other destroy-time problems has always been a big concern: making sure there are no memory leaks in this code was a primary goal from the very beginning of development.

      But I would tend to think that you're right about there being some corner-case oddity here. I guess, to re-state the question, what could cause the following sequence of events:

      1. during normal operation: $foo->{bar} => Some::Object::Ref=HASH(0x86ed4c0)
      2. global destruction begins
      3. at some point during global destruction $foo->{bar} = undef happens -- without any statement to that effect being executed, (and Some::Object::Ref=HASH(0x86ed4c0) is still around and accessible by other means).

      The debugger shows that no code undefs $::index->{_def}. So, either:

      1. The interpreter is allowed to reach in and yank references out from under objects during global destruction. (And perhaps this is what chromatic is saying about throwing reference counting by the board when the interpreter is ready to exit.)
      2. The debugger is wrong.
      3. Or, I'm an idiot, and there's something quite different going on here that I've managed to miss completely.
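
Whichever of these explanations holds, a DESTROY method can be written so that a vanished reference is harmless. The sketch below assumes perl 5.14 or later, where the ${^GLOBAL_PHASE} variable is available; it does not exist on the 5.6.1 mentioned in the question, where only the defined check would apply. The Def and Index packages and the release method are hypothetical stand-ins, not the XML::Comma classes.

```perl
use strict;
use warnings;

# Hypothetical stand-ins for the Index and Def objects in the thread.
package Def;

my $released = 0;

sub new {
    my $class = shift;
    return bless {}, $class;
}
sub release        { $released++; return }
sub released_count { return $released }

package Index;

sub new {
    my ( $class, $def ) = @_;
    return bless { _def => $def }, $class;
}

sub DESTROY {
    my $self = shift;

    # In the DESTRUCT phase, other objects (and references to them)
    # may already be gone, so do nothing that depends on them.
    return if ${^GLOBAL_PHASE} eq 'DESTRUCT';

    # Belt and braces: never call a method on a field without
    # checking that it is still defined.
    return unless defined $self->{_def};

    $self->{_def}->release;
    return;
}

package main;

my $def = Def->new;
{
    my $index = Index->new($def);
}    # $index is destroyed here, during the normal RUN phase

print 'released: ', Def::released_count(), "\n";    # released: 1
```

The trade-off is that any work skipped in the DESTRUCT phase simply does not happen, so cleanup that must run is better triggered explicitly before the interpreter exits.
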
Re: Object reference disappearing during global destruction
by Anonymous Monk on May 15, 2009 at 10:21 UTC
    I'm experiencing the same issue. A pair of references are disappearing from a hash. The keys that contain the missing references are always the same, independently of the number of keys in the hash. Can anyone throw a bit of light on the subject? Is this a programming error (which I'm missing) or a bug in the garbage collector, as someone has suggested? Thanks
      I had the exact same issue, but in a mod_perl environment. After a few days I found the following in our memoization routines to be the problem:
      return join '', map { $_->{type} eq 'static' ? $_->{body} : $self->execute($_) } @{$self->{BODY}};
      changing it to the following fixes the havoc:
      my @bodies = @{$self->{BODY}};
      return join '', map { $_->{type} eq 'static' ? $_->{body} : $self->execute($_) } @bodies;

      dereferencing and iterating over the deref'ed data with map does untoward things to item refcounts, and the gc reaps its victims when it pleases.

      execute is just a nice wrapper for an eval. it touches nothing inappropriately.

      hope this helps a fellow weary soul.
