perlmeditation
tlm
<p>Recently, while writing a couple of [id://445513|nodes] on references (they were intended to be "instructive" though I'm afraid they turned out to be just tiring), I struggled with the question of whether to bring up the subject of autovivification, and if so how thoroughly. Since the nodes were already bloated, I punted, regretfully.</p>
<p>And, as it happens, it seems that this is the tack that most expositions of Perl references take. When autovivification is mentioned at all, it is to rave about its (undeniable) virtues, i.e. the Good. It's as if we are so eager to encourage the tremulous newbie to try riding the bicycle without the training wheels that we don't want to dampen any enthusiasm with talk of potholes and semis. Indeed, only [http://www.sysarch.com/perl/autoviv.txt|rarely] is the dark side of autovivification mentioned, let alone discussed at any length (hence the non-standard ordering of adjectives in this node's title, although I also like the appropriateness of the acronym that results from this reordering). This means that the new programmer, happily rolling along, secure in Perl's dwimitude, usually learns about the Bad and the Ugly sides of autovivification by crashing against a nasty bug.</p>
<p>That's what happened to me...</p>
<readmore>
<p>I was a much younger programmer then... (insert your favorite flashback effects here). The software I was writing did statistical analysis on large collections of entities and their (also numerous) attributes. I was using "association tables", implemented as HoHs whose primary keys were entity ids and whose secondary keys were attribute ids (or vice versa). (I was using <code>undef</code> as the value of all these ordered pairs, which probably didn't help.)</p>
<p>Here's the Bad. Consider the following snippet:
<code>
use strict;
# ...
my $exists =
    exists $hoh{ typo }{ attrib_1 }; # strict can't hear you scream...
# ... life goes on
my $number_of_entities = keys %hoh; # BONK!
</code>
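The count in <code>$number_of_entities</code> is off by 1 (at least): to evaluate <code>$hoh{typo}{attrib_1}</code>, Perl first creates an empty hash at <code>$hoh{typo}</code>, so the table now includes the bogus entity 'typo'.</p>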
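<p>Here is a minimal, self-contained sketch of that effect. The sample data and the variable names other than <code>%hoh</code> are made up for illustration. It also shows one common way to guard the test, by checking the outer key first; that guard is my assumption of a reasonable fix, not something taken from the original program.</p>

```perl
use strict;
use warnings;

# Hypothetical sample data: entity id => { attribute id => undef }
my %hoh = (
    entity_1 => { attrib_1 => undef, attrib_2 => undef },
    entity_2 => { attrib_1 => undef },
);

my $before = keys %hoh;    # number of entities at the start

# Guarded test: check the outer key first, so the inner lookup never
# runs for an unknown entity and nothing gets created.
my $guarded = exists $hoh{typo} && exists $hoh{typo}{attrib_1};
my $after_guarded = keys %hoh;

# Unguarded test: to look inside $hoh{typo}, Perl first creates
# $hoh{typo} = {} (autovivification), even though exists returns false.
my $unguarded = exists $hoh{typo}{attrib_1};
my $after_unguarded = keys %hoh;

print "before: $before, after guarded: $after_guarded, ",
      "after unguarded: $after_unguarded\n";
```

<p>Both tests return false; only the unguarded one leaves a new key behind.</p>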
<p>Or consider this one:
<code>
my @big_in_assoc_1 = grep keys %{$assocs_1{$_}} > 25, keys %assocs_2;
# ... ticks later
my @relative_complement = grep !exists $assocs_1{$_}, keys %assocs_2; # OUCH!
</code>
The first line above collects all the entities from table <code>%assocs_2</code> that have more than 25 attributes in table <code>%assocs_1</code>, but in the process potentially autovivifies any number of empty hashes in <code>%assocs_1</code> (namely those corresponding to entities in <code>%assocs_2</code> that were not originally in <code>%assocs_1</code>). By the time the last line runs, every key of <code>%assocs_2</code> exists in <code>%assocs_1</code>, so <code>@relative_complement</code> is always empty.</p>
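<p>A small sketch of this second trap, using tiny hypothetical tables and a threshold lowered from 25 to 2 so the sample stays short. It computes the relative complement before and after the attribute-counting <code>grep</code>, to show that the <code>grep</code> itself changes the answer.</p>

```perl
use strict;
use warnings;

# Hypothetical sample tables: entity id => { attribute id => undef }
my %assocs_1 = ( a => { x => undef, y => undef, z => undef } );
my %assocs_2 = ( a => { x => undef }, b => { y => undef } );

# Entities in %assocs_2 that are not in %assocs_1, before any damage.
my @complement_before = grep { !exists $assocs_1{$_} } keys %assocs_2;

# Counting attributes with keys %{ ... } creates $assocs_1{b} = {} as a
# side effect, because keys needs a real hash to operate on.
my @big = grep { keys %{ $assocs_1{$_} } > 2 } keys %assocs_2;

# The same complement, recomputed after the grep above.
my @complement_after = grep { !exists $assocs_1{$_} } keys %assocs_2;

print "before: @complement_before\n";
print "after:  @complement_after\n";
```

<p>Entity 'b' is in the complement before the count and gone afterwards, although no line ever assigned to <code>%assocs_1</code>.</p>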
<p>Of course, to the hardened Perl programmer, the lines above are plainly foolish, just asking for it. But to the greenhorn they look pretty reasonable, cool even.</p>
<p>Those were days of interminable debugging, of endless wading through the muck with <code>DB</code>'s <code>s</code>. We went nuts, and some of us never recovered. My buddy... my buddy... Last time I heard of him he was programming Python somewhere out West.</p>
<p>Those of us who pulled through have had to learn to live with the Ugly. Gone are the carefree days when <code>keys</code> was my trusted friend, the only tool I needed to find the size of a table:
<code>
# an autovivified {} is still defined, so count only the non-empty inner hashes
my $number_of_entities = grep $hoh{$_} && %{$hoh{$_}}, keys %hoh;
</code>
Now I know the insidious treachery <code>keys</code> is capable of:
<code>
my @big_in_assoc_1 =
    grep $assocs_1{$_} && keys %{$assocs_1{$_}} > 25,
    keys %assocs_2;
</code>
There's more, but these painful memories are mostly blocked.</p>
</readmore>
<div class="pmsig"><div class="pmsig-439528">
<p><small>the lowliest monk</small></p>
</div></div>