"print" of nonexistent element is actually altering a hash

by larrymenard (Novice)
on Feb 17, 2020 at 18:11 UTC (#11113064)

larrymenard has asked for the wisdom of the Perl Monks concerning the following question:

Monks, your responses to others have been very helpful to me for many years. Now however it is time to post my own question.

I am creating a multi-dimensional hash and then printing a non-existent key in that hash. Curiously (at least to me), that "print" is actually altering the hash, adding an invalid (for lack of a better word) key.

    #!/usr/bin/perl
    use strict;
    use Data::Dumper;

    my %hash;
    $hash{'key1'}{'key2'} = 'value';

    print "\nDump of \%hash (1):\n";
    print Dumper \%hash;

    # This print statement is actually altering the hash
    print "\n\"$hash{'key0'}{'key1'}{'key2'}\"\n";

    print "\nDump of \%hash (2):\n";
    print Dumper \%hash;

The result is:

    Dump of %hash (1):
    $VAR1 = {
              'key1' => {
                          'key2' => 'value'
                        }
            };

    ""

    Dump of %hash (2):
    $VAR1 = {
              'key1' => {
                          'key2' => 'value'
                        },
              'key0' => {
                          'key1' => {}
                        }
            };

The "print" statement is the only thing that can possibly be altering the hash. Indeed, comment it out and the 2nd dump is normal.

I have reproduced this on multiple versions of perl 5, up to and including 5.26.3 (on CentOS 8).

Why is the "print" statement altering the hash?

Any explanation (or even better, advice on how to avoid it) would be much appreciated.

Thanks in advance.

Replies are listed 'Best First'.
Re: "print" of nonexistent element is actually altering a hash
by haukex (Chancellor) on Feb 17, 2020 at 18:19 UTC

    What you're seeing is the effect of "autovivification". It's not the print that's doing this, it's the hash access. $hash{'key0'}{'key1'} means you're asking Perl to look up key1 in the hashref stored at the key key0 in %hash, but since that hashref doesn't exist, Perl infers from $hash{key0}{...} that you want $hash{key0} to be a hashref, so it creates it for you. The same thing happens a level deeper at $hash{'key0'}{'key1'}{'key2'}, which creates $hash{'key0'}{'key1'}. Note how it doesn't create key2 for you.
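As a minimal sketch of the behavior described above (core Perl only; the variable names are made up for illustration), a plain read of a missing nested element leaves the intermediate levels behind but not the final key:

```perl
use strict;
use warnings;

my %hash;
$hash{key1}{key2} = 'value';

# Reading a nested element that does not exist returns undef...
my $missing = $hash{key0}{key1}{key2};

# ...but the two intermediate levels were created along the way.
my $key0_created = exists $hash{key0}             ? 1 : 0;
my $key1_created = exists $hash{key0}{key1}       ? 1 : 0;
my $key2_created = exists $hash{key0}{key1}{key2} ? 1 : 0;

print "key0 created: $key0_created\n";
print "key1 created: $key1_created\n";
print "key2 created: $key2_created\n";
```

The `exists` checks here are safe only because the levels above them already exist by that point; on a fresh hash they would vivify those levels themselves.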

    If you want to avoid this in core Perl, then first note that autovivification, as a rule of thumb, happens in "lvalue" context, that is, in places where a value could be assigned to the hash. This is true in sometimes surprising contexts, such as for loops, because the loop variable is an alias to the values being looped over. If you want to avoid autovivification in such places, you need to use exists to check for the existence of hash keys before accessing them. An alternative is to use no autovivification from CPAN.
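A rough core-Perl sketch of both points, assuming the original poster's hash layout (the names `%flat`, `$alias`, and `$found` are illustrative only): a level-by-level `exists` check leaves the hash untouched, while a `for` loop over a missing element creates it.

```perl
use strict;
use warnings;

my %hash = ( key1 => { key2 => 'value' } );

# Check each level before descending. && short-circuits, so nothing
# below a missing key is ever dereferenced.
my $found = exists $hash{key0}
    && exists $hash{key0}{key1}
    && exists $hash{key0}{key1}{key2};

my @keys_after_guard = sort keys %hash;    # still just key1

# A for loop aliases its list elements, which is lvalue context:
# looping over a missing element creates it (with an undef value).
my %flat;
for my $alias ( $flat{missing} ) { }
my $created_by_loop = exists $flat{missing} ? 1 : 0;

print $found ? "found\n" : "not found\n";
print "keys after guarded check: @keys_after_guard\n";
print "for loop created the key: $created_by_loop\n";
```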

Re: "print" of nonexistent element is actually altering a hash
by hippo (Chancellor) on Feb 17, 2020 at 18:18 UTC
Re: "print" of nonexistent element is actually altering a hash (updated)
by AnomalousMonk (Bishop) on Feb 17, 2020 at 19:37 UTC

    Note also that in deeply nested hashes, exists will autovivify intermediate levels of the hash in the process of testing the existence of a low level element (update: but see this for further clarification).

    c:\@Work\Perl\monks>perl -wMstrict -MData::Dump -le
      "my %hash;
       print 'yes' if exists $hash{'www'}{'xxx'}{'yyy'}{'zzz'};
       dd \%hash;
      "
    { www => { xxx => { yyy => {} } } }

    This effect is also avoided with no autovivification; enabled.

    c:\@Work\Perl\monks>perl -wMstrict -MData::Dump -le
      "no autovivification;
       ;;
       my %hash;
       print 'yes' if exists $hash{'www'}{'xxx'}{'yyy'}{'zzz'};
       dd \%hash;
      "
    {}

    Update 1: See also the recent discussion threads "Sometimes undef is initialized and sometimes not when hash values are fed to grep" and "unexpected modify hash in a distance with grep { $_ }" - both by the same monk!

    Update 2: To illustrate the behavior noted by haukex here, consider:

    c:\@Work\Perl\monks>perl -wMstrict -MData::Dump -le
      "my %h;
       ;;
       $h{'www'}{'xxx'}{'yyy'}{'zzz'};
       dd 'access in void context', \%h;
       %h = ();
       dd 'assigning empty list really does clear hash', \%h;
       ;;
       my $x = $h{'www'}{'xxx'}{'yyy'}{'zzz'};
       dd 'access in assignment (rvalue) context', \%h;
       %h = ();
       ;;
       1 if $h{'www'}{'xxx'}{'yyy'}{'zzz'};
       dd 'access in boolean context', \%h;
       %h = ();
       ;;
       1 for $h{'www'}{'xxx'}{'yyy'}{'zzz'};
       dd 'access in aliased (lvalue) context', \%h;
      "
    Useless use of hash element in void context at -e line 1.
    ("access in void context", { www => { xxx => { yyy => {} } } })
    ("assigning empty list really does clear hash", {})
    (
      "access in assignment (rvalue) context",
      { www => { xxx => { yyy => {} } } },
    )
    (
      "access in boolean context",
      { www => { xxx => { yyy => {} } } },
    )
    (
      "access in aliased (lvalue) context",
      { www => { xxx => { yyy => { zzz => undef } } } },
    )

    In every case except the for-loop (that is, in all the rvalue accesses), the intermediate elements are created but not the lowest-level 'zzz' element. In the for-loop case, where aliasing creates an lvalue access context, the 'zzz' element is created as well.


    Give a man a fish:  <%-{-{-{-<

      A very good point!

      Note also that in deeply nested hashes, exists will autovivify intermediate levels of the hash in the process of testing the existence of a low level element.

      To be nitpicky, it's not exists, but the hash accesses preceding the exists call.

      Anyway, I just wanted to point out that the ugly-but-entirely-core way to avoid the autovivification in this example is:

      print 'yes' if exists $hash{www}
          && exists $hash{www}{xxx}
          && exists $hash{www}{xxx}{yyy}
          && exists $hash{www}{xxx}{yyy}{zzz};

      Although as I described in a recent thread, I try to keep my hash accesses fairly simple.
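One way to avoid writing the chain out by hand, still in core Perl, is a small helper that walks the keys one level at a time. This is only a sketch: `dig` is a hypothetical name, not a core function or anything from this thread, and it assumes every intermediate level is a plain hashref.

```perl
use strict;
use warnings;

# Hypothetical helper: follow a list of keys through nested hashrefs,
# returning nothing (undef in scalar context) as soon as a level is
# missing. It only reads keys that are known to exist, so it never
# autovivifies.
sub dig {
    my ( $node, @keys ) = @_;
    for my $key (@keys) {
        return unless ref($node) eq 'HASH' && exists $node->{$key};
        $node = $node->{$key};
    }
    return $node;
}

my %hash = ( www => { xxx => { yyy => { zzz => 42 } } } );

my $hit       = dig( \%hash, qw( www xxx yyy zzz ) );    # value is present
my $miss      = dig( \%hash, qw( aaa bbb ccc ) );        # path is absent
my $key_count = keys %hash;                              # top-level key count

print "hit: $hit\n";
print "miss is ", ( defined $miss ? 'defined' : 'undef' ), "\n";
print "top-level keys: $key_count\n";
```

Note that a stored undef and a missing path both come back as undef from this sketch; if that distinction matters, the explicit `exists` chain is still the way to go.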

Re: "print" of nonexistent element is actually altering a hash
by larrymenard (Novice) on Feb 17, 2020 at 19:34 UTC
    Thank you Monks, the "exists()" is exactly what I needed to test whether the key exists or not.

      Right, but be careful: AnomalousMonk warned that exists() will autovivify in some cases. Perhaps you want to run your program with and without no autovivification; and diff the resulting data structures.

      Hi
      use Data::Diver qw/ Dive /;
      print Dive( \%hash, qw/ key0 key1 key2 / );  ## doesn't alter %hash

      Yes, as other "Anonymous Monks" have said, this is a feature. If you are legitimately trying to use something like $hash{'www'}{'xxx'}{'yyy'}{'zzz'}, it is hugely convenient that Perl will automagically create $hash{'www'}, then $hash{'www'}{'xxx'}, then $hash{'www'}{'xxx'}{'yyy'}, and maybe even $hash{'www'}{'xxx'}{'yyy'}{'zzz'}, all without asking. This very simple trick bypasses a lot of tedium and is usually beneficial. Just not in your case!

        Mike, you do realize you could save yourself the time and hassle of posting here and both the OP and PerlMonks would be just as well off. Don't you?

Re: "print" of nonexistent element is actually altering a hash
by Anonymous Monk on Feb 17, 2020 at 18:19 UTC
