PerlMonks  

Why does exists cause autovivication?

by Argel (Prior)
on Dec 29, 2007 at 03:00 UTC ( #659432=perlquestion )
Argel has asked for the wisdom of the Perl Monks concerning the following question:

I got bit by 'exists' causing autovivification again. It seems that every year I forget about it at least once. I would have thought this would have been "fixed" in Perl 5.10, since Perl certainly isn't doing what I mean.

So now I'm curious -- what's the rationale behind it? Is it just something hard to code around in the Perl 5 codebase? Or something else? Would it be possible to have perl throw a warning in a future 5.10 (or 5.12) release?

Obligatory code snippets....

# Broken version:
next LASTLOG unless exists $user_by_uid{$uid}->{$host};

# Data::Dumper output when $uid is not in the hash:
$user_by_uid = {
    '93688' => {},
    '58684' => {},
    '58017' => {},
};

# Fixed version:
next LASTLOG unless exists $user_by_uid{$uid} && exists $user_by_uid{$uid}->{$host};

Update: I'd like to thank everyone for the replies. I was interested in the history of why it was designed/implemented this way and perrin gave me some things to look at in 659433. I'd like to thank ikegami for his answers in the 659446 thread and special thanks to tye for his responses in Re: Why does exists cause autovivication? (myth, mods) and 659643 respectively. Again, thanks everyone!

Update2: I'd especially like to thank demerphq for his reply in: "Re^7: Why does exists cause autovivication?"!

Happy Holidays!

Re: Why does exists cause autovivication?
by perrin (Chancellor) on Dec 29, 2007 at 03:06 UTC
    It seems pretty logical to me. The exists() call itself doesn't cause auto-vivification, but the first hash lookup to get the second hash that you're using exists() on does. I expect you can find a lot of discussion about it if you dig into the old changes log around 5.00504 or thereabouts.
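To make the two steps visible, here is a minimal sketch. The hash name comes from the original post; the uid 1001 and host name hostA are made up for illustration. It checks the key count after a one-level exists and again after a two-level exists.

```perl
use strict;
use warnings;

my %user_by_uid;

# One-level exists: nothing has to be dereferenced, so nothing is created.
my $top            = exists $user_by_uid{1001};
my $keys_after_top = keys %user_by_uid;

# Two-level exists: $user_by_uid{1001} has to be dereferenced as a hash
# before exists can look for 'hostA' in it, and that dereference creates it.
my $nested            = exists $user_by_uid{1001}{hostA};
my $keys_after_nested = keys %user_by_uid;

print "top=",    ( $top    ? 1 : 0 ), " keys=$keys_after_top\n";
print "nested=", ( $nested ? 1 : 0 ), " keys=$keys_after_nested\n";
print "vivified value is a ", ref( $user_by_uid{1001} ), " ref\n";
```

Both exists calls return false; only the second one leaves a new key behind.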
Re: Why does exists cause autovivication?
by chromatic (Archbishop) on Dec 29, 2007 at 03:17 UTC
    next LASTLOG unless exists $user_by_uid{$uid}->{$host};

    How is perl to know that you don't want to dereference $user_by_uid{$uid} when you dereference it explicitly? Where should exists start dereferencing, and how do you rewrite the evaluation order of expressions to make it work its way through the dereferencing chain without breaking the ability of exists to work on other expressions?

      "How is perl to know ..."

      Of course it knows; it is just the language's choice to do so. The whole autovivification thing could be avoided, for the better.

      It is not that perl does not know; the language designer made it respond that way.

        If you want otherwise you know where to find it. Don't let the door hit you in the tuckus.

        (Anyone else getting a whiff of troll off this anonomonk?)

        The cake is a lie.
        The cake is a lie.
        The cake is a lie.

Re: Why does exists cause autovivication?
by Anonymous Monk on Dec 29, 2007 at 04:16 UTC

    Quite logical to me... because you peeked.

    Big picture... Autovivification is one of the worst misfeatures in perl. Letting the language guess the programmer's mind is never a good thing. Initializing a hash element never was and never will be a big deal to any normal mind, so there was never a need to be that lazy; perl seems to help when in fact it does damage.

      Oh how ye discredit thyself! Auto-viv is much more useful than you make it sound and it only rarely causes problems, usually because someone hasn't managed to understand how it works.

      ---
      $world=~s/war/peace/g

        Thy careless writing, brother, doth bring pain unto mine eyes, for verily thou mixest the high and low modes of address, like unto a novice that mistaketh his sigils, as "@foo[0]" &c. Wherefore I commend unto thee that when thou wouldst chastise thy brethren so, thou shouldst say instead "How thou discreditest thyself", or "how ye discredit yourself".

        I think I agree with your point, though. Autovivification is one of those features that's both a strength and a weakness of Perl. A strength, because it's a very practical feature that makes life a bit easier for you and me; a weakness, because people who suffer from a surfeit of the wrong sort of laziness use it as an example of how Perl is "too hard".

      Let me see ... how many times have I spent time hunting a bug caused by autovivification during my ten years with Perl ... zero. How many times did it save me from code like

      if ($data and $data->{foo} and $data->{foo}{bar} and $data->{foo}{bar}{baz} and $data->{foo}{bar}{baz}{bat})

      or

      if (!exists($data->{$key})) {
          $data->{$key} = [];
      }
      push @{$data->{$key}}, $new_value;

      ? Countless.

      If you hear about autovivification for the first time it may sound scary, but you do get used to it. And the problems caused by autovivification are few and far between.
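For comparison, a small sketch of the version that leans on autovivification instead of the explicit guard. The sample data and the @pairs name are invented for the example; $data, $key and $new_value follow the snippet above.

```perl
use strict;
use warnings;

my $data  = {};
my @pairs = ( [ fruit => 'apple' ], [ fruit => 'pear' ], [ veg => 'leek' ] );

for my $pair (@pairs) {
    my ( $key, $new_value ) = @$pair;

    # No "unless exists" guard: the array ref under $key is created
    # automatically the first time we push onto it.
    push @{ $data->{$key} }, $new_value;
}

print join( ',', @{ $data->{fruit} } ), "\n";
print join( ',', @{ $data->{veg} } ),   "\n";
```

This is the assignment-side behaviour that most replies in the thread want to keep.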

Re: Why does exists cause autovivication?
by ikegami (Pope) on Dec 29, 2007 at 04:43 UTC

    exists doesn't autovivify.

    >perl -le "my %hash; exists($hash{foo}); print scalar %hash"
    0

    Dereferencing (->) does.

    >perl -le "$hash->{foo}; print $hash ? 1 : 0"
    1
      I was under the impression, though, that the dereference, autovivification, etc. happen within 'exists'. Assuming that is correct, at the first indication that autovivification is going to occur it could return false (and thus prevent the autovivification), assuming the Perl 5 code base would easily allow for that.

      Or is my assumption about when the dereferencing and autovivification occur wrong?

        "Within exists" implies it's done by exists. It's not. Perhaps a better term would be "within an exists context", although there's currently no such thing.

Re: Why does exists cause autovivication? (myth, mods)
by tye (Cardinal) on Dec 29, 2007 at 05:56 UTC

    exists has never had any special power over auto-vivification at all. See Re^10: searching a list (myth) for a full explanation of that.

    See Data::Diver for an alternate solution. Well-considered suggestions for improvements welcome; I'm quite sure there is ample room for improvement.

    Rather than "fix" exists to prevent auto-vivification, I'd prefer to have a pragma that makes auto-vivification a fatal error (or optionally a warning). There are times when I program very carefully and any case of autovivification is quite simply the indication of a bug. It is rather like "use strict" or like the "Use of uninitialized value" warning. There are often times where the restrictions imposed by such things are inconvenient (one-liners in the case of "use strict") but there are other times where enforcing "extra care" is a real boon.

    I also think such a pragma would be easier to implement than changing exists. Also, once you have the pragma, it makes it easier to give that power to exists as well -- since the pragma requires a flag for "don't auto-vivify" that can be attached to op-nodes (that is how lexical pragmata work) and "fixing" exists would require the same flag.

    - tye        

Re: Why does exists cause autovivication?
by graff (Chancellor) on Dec 29, 2007 at 07:07 UTC
    ... Perl certainly isn't doing what I mean... what's the rationale behind it?

    Think a little harder about what you are asking for here, and see if you can come up with a sensible rationale for that. Here's a hint:

    use strict;
    my %HoH;

    # case 1:
    {
        my $uid = 666;
        my $host = "foo.bar";
        $HoH{$uid}{$host} = "hold onto this value";
    }
    # what I meant there was: please autovivify $HoH{$uid}
    # and set its value to be an anon.hash ref

    # case 2:
    {
        my $uid = "something_unexpected_or_possibly_undef";
        my $host = "who_cares_what_value_is_assigned_if_any";
        if ( exists( $HoH{$uid}{$host} )) {
            do_something_appropriate();
        }
    }
    # what I meant there was: please do not autovivify $HoH{$uid}; instead,
    # do a separate check of hash-key existence for each level of hash nesting

    The "exists()" function would have to be the only one that treats dereferencing syntax in this more complicated manner, otherwise perl would not be able to do what you mean in the simple assignment usage. I'm really not familiar with how "exists()" is currently implemented, but I have a hunch that the only way it could be given this special behavior would be to have special operations at compile time that would rewrite your simple expression for you, creating the multi-stage test for hash key existence, which as you already know would need to be done. (And it would have to do the right thing for all variations and depths of dereferencing syntax -- ooh! a source-code filter that creates a recursive function... that sounds like fun!)

    Would it be a good idea to make the "exists()" function actually work as a source-code filter at compile time? Food for thought (if you happen to like eating glass shards or sharp metal objects).

    As others have pointed out, the problem is not with the "exists()" function, but rather with the process of dereferencing a hash structure. If perl simply refused to autovivify upper/outer hash keys in an HoH(oH...) structure, there would be much less need for using "exists()" (and no need to make it "special" in its treatment of dereferencing). Would you like to take away the ability to autovivify "upper/outer" hash keys in assignment statements? (If that's your preference, I think a lot of perl users would disagree.)

    If you are having trouble with this repeatedly, you might want to make up your own version of "exists()" as a module -- but the calling syntax would have to be different... Maybe something like this?

    use strict;
    use warnings;

    my %HoH;

    if ( Exists( \%HoH, "foo", "bar", "baz" )) {
        warn "Something is very strange\n";
    }
    print scalar keys %HoH, "\n";

    $HoH{foo} = { bar => { baz => "okay" }};

    if ( Exists( \%HoH, "foo", "bar", "baz" ) and !Exists( \%HoH, "oops", "uhoh" )) {
        print "All is well\n";
    }
    print scalar keys %HoH, "\n";

    # the following sub would actually be in a module:

    sub Exists {
        my ( $href, $topkey, @subkeys ) = @_;
        return unless ( ref( $href ) eq 'HASH' );
        my $result = 0;
        if ( exists( $$href{$topkey} )) {
            $result = ( ref( $$href{$topkey} ) eq 'HASH' and @subkeys )
                ? Exists( $$href{$topkey}, @subkeys )
                : 1;
        }
        return $result;
    }

    (updated to fix a misplaced paren, and to add checks on the $href and @subkeys parameters)

    Others could probably write that to be more elegant/compact, but it does what you want to do without violence to a long-established convention.
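One possible more compact take, offered only as a sketch: a loop instead of recursion. The name exists_path is hypothetical and not from any module. It walks the keys one level at a time and stops at the first level that is missing or is not a hash, so nothing gets created along the way.

```perl
use strict;
use warnings;

# Hypothetical helper: returns 1 only if every key in @keys can be
# followed through nested hashes starting at $node, otherwise 0.
sub exists_path {
    my ( $node, @keys ) = @_;
    return 0 unless @keys;
    for my $key (@keys) {
        # Check before stepping down, so no level is ever dereferenced
        # unless it is already known to be there.
        return 0 unless ref($node) eq 'HASH' && exists $node->{$key};
        $node = $node->{$key};
    }
    return 1;
}

my %HoH = ( foo => { bar => { baz => "okay" } } );

print exists_path( \%HoH, qw(foo bar baz) ) ? "found\n" : "missing\n";
print exists_path( \%HoH, qw(oops uhoh) )   ? "found\n" : "missing\n";
print scalar( keys %HoH ), "\n";
```

One design difference to be aware of: as I read the recursive Exists above, it reports success when it reaches a non-hash value while subkeys remain, whereas this sketch reports failure in that case. Which behaviour is wanted depends on the data.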

      I think what I want is: Don't autovivify unless I'm actually performing an assignment or explicitly casting to a hash/array:
      $x[2]{z} = 1;     # autovivify $x[2] as a hash
      $x[2]{z};         # don't autovivify $x[2]
      %{$x[2]};         # autovivify $x[2] as a hash
      func($x[2]{z});   # don't autovivify $x[2]
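For reference, a minimal sketch of what a plain read does today, which is the case the wish list above would change. The variable names other than @x are made up for the example.

```perl
use strict;
use warnings;

my @x;

# A plain read through two levels, with no assignment into @x.
my $value = $x[2]{z};

my $length_after_read = scalar @x;                 # $x[2] now exists
my $slot_type         = ref $x[2];                 # it was created as a hash ref
my $leaf_exists       = exists $x[2]{z} ? 1 : 0;   # the final key was not created

print "length=$length_after_read type=$slot_type leaf=$leaf_exists\n";
```

Only the intermediate level is created by the read; the last key in the chain is left alone.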

Node Type: perlquestion [id://659432]
Approved by Old_Gray_Bear
Front-paged by redhotpenguin