PerlMonks  

Why doesn't this die with "Can't use an undefined value as an ARRAY reference"?

by kikuchiyo (Hermit)
on Oct 18, 2017 at 17:49 UTC ( [id://1201610]=perlquestion )

kikuchiyo has asked for the wisdom of the Perl Monks concerning the following question:

Consider the following script:

#!/usr/bin/perl
use strict;
use warnings;
use Test::More;
use Data::Dumper;
my $hash = { '50' => [ 1 ] };
print Dumper $hash;
is(keys %{$hash}, 1, q/keys %{$hash} is 1/);
is(scalar @{$hash->{'50'}}, 1, q/$hash->{'50'} is 1/);
is(scalar @{$hash->{'100'}}, 0, q/$hash->{'100'} is 0/);
print Dumper $hash;
done_testing();

With Perl 5.24.3 it runs to the end and all tests pass, even though I would expect it to die with a "Can't use an undefined value as an ARRAY reference" error when it tries to dereference $hash->{'100'}, which does not exist.

Compare with

#!/usr/bin/perl
use strict;
use warnings;
my $hash = { '50' => [ 1 ] };
print scalar @{$hash->{'100'}};

which dies with the expected error.

Under Perl 5.16 the first program also dies with the expected error. (This is how we initially noticed the problem: a program that was developed on 5.22+ needed to be ported to CentOS 7, which has 5.16, and the tests began to fail there.)

What is going on here?

(Errata: I have now run it under more Perl versions (perversions): it doesn't die under Perl 5.22 and above, but dies as expected under Perl 5.20 and below.)

Replies are listed 'Best First'.
Re: Why doesn't this die with "Can't use an undefined value as an ARRAY reference"?
by haukex (Archbishop) on Oct 18, 2017 at 18:57 UTC

    Interesting... a minimal test case is the following, which throws the "Can't use an undefined value as an ARRAY reference" error in all Perl releases from 5.6 to 5.20, but doesn't cause an error in Perl 5.22 thru 5.26.

    $ perl -wMstrict -le 'sub x{} my $h={50=>[1]}; x(scalar @{$h->{100}})'

    It apparently has something to do with that specific call, since the following two always fail:

    $ perl -wMstrict -le 'sub x{} my $h={50=>[1]}; x(0+@{$h->{100}})'
    Can't use an undefined value as an ARRAY reference at -e line 1.
    $ perl -wMstrict -le 'my $h={50=>[1]}; print(scalar @{$h->{100}})'
    Can't use an undefined value as an ARRAY reference at -e line 1.

    A bisect boils this down to commit 569ddb4a: "scalar($#foo) needs to propagate lvalue context" and perl5220delta says: "scalar() now propagates lvalue context, so that for(scalar($#foo)) { ... } can modify $#foo through $_."

    The following two don't cause any failures on any release from 5.6 to 5.26, and in both cases cause the hash entry 100=>[] to autovivify:

    $ perl -wMstrict -le 'sub x {} my $h={50=>[1]}; x(@{$h->{100}})'
    $ perl -wMstrict -le 'sub x ($) {} my $h={50=>[1]}; x(@{$h->{100}})'

    So the change to scalar in 5.22 is simply passing the autovivification behavior through.
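A minimal sketch of the difference described above, assuming a no-op helper called noop (a hypothetical name, not from the thread). It contrasts an array dereference passed directly as a sub argument, which autovivifies, with the same dereference used as a plain value, which dies under strict refs. The scalar() variant is deliberately left out because its behavior depends on the Perl version, as noted above.

```perl
use strict;
use warnings;

# Hypothetical helper for illustration: accepts arguments and ignores them.
sub noop { return }

my $h = { 50 => [1] };

# Sub arguments are evaluated in lvalue context, so the missing
# entry is autovivified to an empty array instead of raising an error.
noop( @{ $h->{100} } );
my $vivified = exists $h->{100} ? 1 : 0;

# Using the dereference as a plain value (rvalue) dies under strict refs
# and does not create the hash entry.
my $ok  = eval { my $n = 0 + @{ $h->{200} }; 1 };
my $err = $ok ? '' : $@;

print "100 vivified: $vivified\n";
print "200 error: $err";
```

If this reading is right, $h ends up with keys 50 and 100 but not 200.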

      Thanks for the investigation!

      Some additional observations:

      no autovivification qw/store/; causes the offending part (of your minimal example) to die with "Can't vivify reference at -e line 1.".

      Reading the autovivification module's documentation gave me a hint: lvalue context. Searching for that led me to a PerlMonks thread from 12 years ago: "Autovivification of scalars in sub calls". Those who don't know history are doomed to repeat it, apparently.

      The key phrase seems to be "arguments to subs are lvalues" and apparently this is a feature. Still, I find it surprising and confusing.

      The linked thread mentions that "incidentally, builtin functions do not provide a similar service" (i.e. autovivification for their arguments) - this is not unconditionally true:

      #!/usr/bin/perl
      use strict;
      use warnings;
      use Data::Dumper;
      my $h={};
      for my $x ("pop", "shift", "map {1}", "grep {1}", "chomp", "lc", "localtime", "cos") {
          my $str = $x.q/ @{$h->{'/ . $x . qq/'}}\n/;
          print $str;
          eval $str;
          print "\t".$@ if $@;
      }
      print Dumper $h;
      ...results in:
      pop @{$h->{'pop'}}
      shift @{$h->{'shift'}}
      map {1} @{$h->{'map {1}'}}
      grep {1} @{$h->{'grep {1}'}}
      chomp @{$h->{'chomp'}}
      lc @{$h->{'lc'}}
          Can't use an undefined value as an ARRAY reference at (eval 7) line 1.
      localtime @{$h->{'localtime'}}
          Can't use an undefined value as an ARRAY reference at (eval 8) line 1.
      cos @{$h->{'cos'}}
          Can't use an undefined value as an ARRAY reference at (eval 9) line 1.
      $VAR1 = {
                'chomp' => [],
                'pop' => [],
                'map {1}' => [],
                'grep {1}' => [],
                'shift' => []
              };

        Indeed, there does seem to be some inconsistency there. The only thing I see about your examples that might be considered "consistent" is that those functions that take a scalar argument are the ones throwing an error - although I haven't yet expanded my list to more functions to see if this holds true elsewhere.

        $ perl -le 'printf "%10s %s\n",$_,prototype("CORE::$_")//"undef" for qw/pop shift map grep chomp lc localtime cos/'
               pop ;\@
             shift ;\@
               map undef
              grep undef
             chomp undef
                lc _
         localtime ;$
               cos _

        As for subroutine arguments, the elements of @_ being aliases to the actual parameters is documented and useful, but also one of those things that bites many people.
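To illustrate the aliasing mentioned above, here is a small sketch. The sub names bump_alias and bump_copy are made up for this example. Writing to $_[0] changes the caller's variable, while copying @_ into a lexical first does not.

```perl
use strict;
use warnings;

# Hypothetical helpers for illustration only.
sub bump_alias { $_[0]++; return }              # writes through the alias in @_
sub bump_copy  { my ($x) = @_; $x++; return }   # works on a private copy

my $n = 1;

bump_copy($n);
my $after_copy = $n;     # caller's variable is untouched

bump_alias($n);
my $after_alias = $n;    # caller's variable was incremented

print "after copy: $after_copy, after alias: $after_alias\n";
```

Because the elements of @_ have to be modifiable like this, Perl evaluates sub arguments in lvalue context, which is where the autovivification comes from.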

        In any case, this remains an interesting issue, thanks for continuing the investigation.

      One more thing I've just noticed:

      The perldelta fragment mentions for(scalar($#foo)) { ... }. But what does this even mean? Why would anybody do this? $#foo is already a scalar (the index of the last element of @foo), why would anybody call scalar on it, and why would anybody use it in a foreach?

      If you want to enlarge or shrink the array via $#foo, you can do it simply by $#foo = 43;. This foreach nonsense is just obfuscation and I don't see why it was "corrected".

        But what does this even mean? Why would anybody do this?

        I would guess that it is maybe just a simplification of a different case; using for as a topicalizer can be pretty useful. Clicking through the commits seems to show that the change to scalar was a bugfix resulting from the discussion in #24346. I found that an enlightening point was: "Since scalar is just a directive to change context, I don't see why it should change *anything* else."
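As a rough sketch of both forms under discussion: direct assignment to $#foo, and for used as a topicalizer so that $_ aliases $#foo. The array contents are arbitrary. The for(scalar($#foo)) spelling from the perldelta entry is left out because, per that entry, it only behaves this way from 5.22 on.

```perl
use strict;
use warnings;

my @foo = (1 .. 5);

# Direct assignment to the last index grows the array (new slots are undef).
$#foo = 43;
my $len_after_assign = @foo;

# for aliases $_ to $#foo, so assigning to $_ resizes the array as well.
for ($#foo) { $_ = 2 }
my $len_after_for = @foo;

print "after assignment: $len_after_assign, after for: $len_after_for\n";
```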

        In general, I think this boils down to a problem caused by autovivification, which, while of course powerful and nice, is also sometimes the source of such confusion. There is of course the "no autovivification;" pragma from CPAN, but personally I like to just be explicit and use exists and friends to avoid things vivifying when I don't want them to.
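One way the explicit approach might look, as a sketch only. count_items is a hypothetical helper name, and it assumes that any value present under the key is an array reference.

```perl
use strict;
use warnings;

# Hypothetical helper: number of items stored under $key, or 0 if the key
# is absent. exists is checked first, so nothing is vivified in $h.
sub count_items {
    my ($h, $key) = @_;
    return 0 unless exists $h->{$key};
    my $aref = $h->{$key};    # assumed to be an array reference
    return scalar @$aref;
}

my $h = { 50 => [1] };

my $present = count_items($h, 50);
my $missing = count_items($h, 100);

print "present: $present, missing: $missing\n";
print "key 100 exists afterwards: ", ( exists $h->{100} ? "yes" : "no" ), "\n";
```

Written this way, the test from the original question would not depend on whether scalar passes lvalue context through.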
