perlquestion
khkramer
<p> I've been hunting a nasty bug for the last two days. The short
version is that one field -- containing an object reference -- from
some of my hashed-based objects is getting undef'ed at some
seemingly-arbritrary time during global destruction. Here is some
pretend code that illustrates the problem; if this pseudo-code
exhibited the same behavior as my buggy code, running this snippet
would trigger the die in the Badly_Behaved package's destroy (I'm
running perl 5.6.1 under a 2.4.x GNU/Linux): </p>
<readmore>
<code>
package Badly_Behaved;
sub new {
my $class = shift;
my $self = {};
bless ( $self, $class );
$self->{_Some_object_ref} = Foo::Get_other_object_ref();
}
sub DESTROY {
die 'terrible problem!' if ! defined $self->{_Some_object_ref};
}
package main;
$hmph = Badly_Behaved->new();
</code>
<p> Of course, in real life things aren't so simple. This bug happens
during global destruction of an XML::Comma execution environment:
several dozen objects (at the minimum) are being cleaned up. The vast
majority of the time, there are no ill effects. Even though a few
fields seem to be disappearing before they should, destructors are
usually called in an order that doesn't trigger any problematic
behavior. </p>
<p> But not quite always. We isolated a test-case yesterday. Here is
the original oddball snippet (courtesy of Eric Loeb) that I started
working from: </p>
<code>
#!/usr/bin/perl -w
use strict;
$|++;
use XML::Comma;
my $DOWNLOAD_FILE = "DOWNLOAD_TIME";
my $ldt = 1021398860;
my $local_users =
XML::Comma::Def->read(name=>'W_User')->get_index('main')->
iterator(where_clause=>"record_last_modified > $ldt");
while ($local_users++) {
}
sub dump_log {
}
</code>
<p>which produces the following error:</p>
<code>
(in cleanup) Can't call method "def_by_name" on an
undefined value at
/usr/lib/perl5/site_perl/5.6.1/XML/Comma/NestedElement.pm
line 288 during global destruction.
</code>
<p> The object reference that the cleanup code is looking for has
disappeared. Removing the empty dump_log subroutine definition
eliminates the error message. Changing the $local_users post-increment
to a pre-increment eliminates the error message. Lots of other tiny,
apparently-unrelated changes eliminate the error message. </p>
<p> After much wailing and gnashing of print statements, we boiled the
problem down to the "undef'ing reference" issue. The critical
references always go away during global destruction earlier than it
looks like they should, but the itty bitty changes have the effect of
re-arranging the order in which destructors are called, and in most
orderings the reference disappearance doesn't cause any
problems. Here's the slimmed down test script: <p>
<code>
#!/usr/bin/perl -w
use strict;
$|++;
use XML::Comma;
$::index =
XML::Comma::Def->read(name=>'AllAfrica_NewsStory')->get_index('post');
print "-->index/def: " $::index . ' ' . $::index->{_def} . "\n";
</code>
<p> The my'ed variable has moved to the main package, so that it's
easy to peek at it from various other places. The last line is only
there so that I have a convenient sanity check and place to halt the
debugger. This code does not produce any errors, but I've got
everything instrumented so that I can see the $::index->{_def}
reference disappearing during global destruction. </p>
<p> At this point, I've added an explicit DESTROY for every object in
the system, so that I can stick in print statements and/or have a
place to halt the debugger. During global destruction, everything
trips merrily along, with 70-odd objects passing peacefully into the
night. Then -- in some way that I haven't yet been able to pinpoint --
the $index->{_def} reference becomes ! defined. This is the only field
in the $index object that loses its content. It's also the only field
that is an object reference; the others are scalars, hash refs and
array refs. </p>
<p> The debugger (run with its inhibit_exit option set to 0, to
enabling tracing through global destruction), doesn't show any code
changing the field's content. Setting a Watch on $index->{_def} does
stop the debugger when $index->{_def} becomes undefined, but no
statements that show up in the trace ever do anything to that
field. It's like some invisible hand reaches into the system between
two of the many, many DESTROY calls, and yanks out that (and only
that) reference. The object that $index->{_def} points to is not
destroyed until sometime after the reference disappears (as one would
expect). </p>
<p> Here's a little snippet of my debugging output, showing what
happens: </p>
<code>
D: XML::Comma::NestedElement=HASH(0x86eeb6c)
XML::Comma::Bootstrap=HASH(0x86a2084)
D: XML::Comma::Element=HASH(0x86ef81c)
XML::Comma::Bootstrap=HASH(0x86a2084)
D: XML::Comma::Element=HASH(0x86ed1e4)
<undef>
XML::Comma::Indexing::Index=HASH(0x86ed268)
_Index_sorts --> HASH(0x86f8abc)
_def --> <undef>
_Index_columns_pos --> 13
_nested_elements --> ARRAY(0x86eb7c8)
_tag --> index
_Index_doctype --> AllAfrica_NewsStory
_init_index --> 38
_nested_lookup_table --> HASH(0x86eb7d4)
_attrs --> HASH(0x86ed1f0)
_Hookable_index_hooks --> ARRAY(0x86f77f0)
_Hookable_stop_rebuild_hooks --> ARRAY(0x86f8a08)
_Index_store_type --> post
_tag_up_path --> DocumentDefinition:index
_Index_bcollections --> HASH(0x86fc9b4)
DBH_connect_check --> _check_db
Doc_storage --> HASH(0x86f7688)
_Index_columns --> HASH(0x86f8a2c)
D: XML::Comma::NestedElement=HASH(0x86eea58)
<undef>
</code>
<p> Each DESTROY normally prints out two lines. The D: lines print out
the object being destroyed, and the following line prints out
$::index->{_def}. The first time the destroy finds $::index->{_def} to
be undefined, I dump all of the object's fields. It's worth noting
that there's still a reference elsewhere in the system to the
Bootstrap object, and it's retreivable through that ref on past the
point where $::index->{_def} becomes undefined. </p>
<p> And here's the same thing from inside the debugger, showing that
no statements other than the debugging lines in the DESTROY are
running, yet the value of the field changes: </p>
<code>
D: XML::Comma::NestedElement=HASH(0x8ae51c4)
XML::Comma::AbstractElement::DESTROY(/u/khkramer/src/perl/XML-Comma/XML/Comma/AbstractElement.pm:224):
224: print ' ' . ($::index->{_def}||'<undef>')."\n";
XML::Comma::Bootstrap=HASH(0x8982c88)
XML::Comma::AbstractElement::DESTROY(/u/khkramer/src/perl/XML-Comma/XML/Comma/AbstractElement.pm:226):
226: if ( (! defined $::index->{_def}) && (! $::index_dumped)) {
XML::Comma::AbstractElement::DESTROY(/u/khkramer/src/perl/XML-Comma/XML/Comma/AbstractElement.pm:223):
223: print 'D: ' . $_[0] . "\n";
D: XML::Comma::Element=HASH(0x8ae6e24)
XML::Comma::AbstractElement::DESTROY(/u/khkramer/src/perl/XML-Comma/XML/Comma/AbstractElement.pm:224):
224: print ' ' . ($::index->{_def}||'<undef>')."\n";
XML::Comma::Bootstrap=HASH(0x8982c88)
XML::Comma::AbstractElement::DESTROY(/u/khkramer/src/perl/XML-Comma/XML/Comma/AbstractElement.pm:226):
226: if ( (! defined $::index->{_def}) && (! $::index_dumped)) {
XML::Comma::AbstractElement::DESTROY(/u/khkramer/src/perl/XML-Comma/XML/Comma/AbstractElement.pm:223):
223: print 'D: ' . $_[0] . "\n";
D: XML::Comma::Element=HASH(0x8ae5110)
XML::Comma::AbstractElement::DESTROY(/u/khkramer/src/perl/XML-Comma/XML/Comma/AbstractElement.pm:224):
224: print ' ' . ($::index->{_def}||'<undef>')."\n";
XML::Comma::Bootstrap=HASH(0x8982c88)
XML::Comma::AbstractElement::DESTROY(/u/khkramer/src/perl/XML-Comma/XML/Comma/AbstractElement.pm:226):
226: if ( (! defined $::index->{_def}) && (! $::index_dumped)) {
Watchpoint 0: $::index->{_def} changed:
old value: 'XML::Comma::Bootstrap=HASH(0x8982c88)'
new value: undef
XML::Comma::AbstractElement::DESTROY(/u/khkramer/src/perl/XML-Comma/XML/Comma/AbstractElement.pm:223):
223: print 'D: ' . $_[0] . "\n";
</code>
<p> I'm getting close to running out of ideas on this one, and am
really hoping that someone else has seen a similar problem. (Or that
there's some painfully-obvious thing about reference counting, destroy
semantics or the debugger that I'm missing.) </p>
<p>Kwin</p>