Beefy Boxes and Bandwidth Generously Provided by pair Networks
Don't ask to ask, just ask
 
PerlMonks  

Preventing unintended list expansion inside hash literals.

by gregory-nisbet (Novice)
on Jan 05, 2017 at 07:52 UTC ( #1178987=perlquestion: print w/replies, xml ) Need Help??

gregory-nisbet has asked for the wisdom of the Perl Monks concerning the following question:

=> and , both impose list context on their arguments in the body of hash table since it's an ordinary list expression. This can have unintended consequences in cases like this.

Suppose we have a function that takes an arbitrary key from a hash reference. However, when the hash table is empty, we just call return. Most of the time we're calling this function in scalar context, so we're fine.

sub some_key { my $hash_ref = shift; %$hash_ref or return; my ($key, undef) = each %$hash_ref; keys %$hash_ref; # reset iterator $key; }

However, one day, we decide to use it to construct a value in a hash table (or key, argument is the same).

my %some_hash = ( key1 => some_key($bar), key2 => some_key($baz), );

If $bar and $baz are both empty, then we now have %some_hash = (key1 => 'key2');, which almost certainly wasn't intended.

It's clear what happened. Most of the time some_key returns a list with one element, and we treated it as if it always does.

In my own code I avoid explicitly creating keys or values from the return values of subroutines and use an intermediate scalar variable instead, unless my expression is guaranteed to be a single-element list in list context like $array_ref->[0], [function()], or $hash{$key}. Note though that (...)[0] is a zero-element list in list context if the list is empty.

What's the idiomatic way to guard against unintentionally expanding a non-singleton list into the bodies of your hashes? I haven't seen anything about it specifically in style guides so far.

Replies are listed 'Best First'.
Re: Preventing unintended list expansion inside hash literals.
by kcott (Archbishop) on Jan 05, 2017 at 09:38 UTC

    G'day gregory-nisbet,

    "What's the idiomatic way to guard against unintentionally expanding a non-singleton list into the bodies of your hashes?"

    I don't know if there's an idiom for that specific case; however, as a general case, you can use scalar to force scalar context. So, if you're happy to have undef values in your hash:

    key1 => scalar(some_key($bar))

    If you don't want undef values and are using perl v5.10, or higher, you can do something like these:

    key1 => (some_key($bar) // 0) key1 => (some_key($bar) // [])

    If none of those suit, then what you're currently doing ("use an intermediate scalar variable") seems like a perfectly fine solution.

    — Ken

      Hello kcott,

      I considered using scalar too, but then the comma operator returns the right-most value in the list, which doesn’t seem to be what the OP is looking for:

      19:53 >perl -MData::Dump -wE "sub f { return (17, 42); } my %h = ( key +1 => scalar f() ); dd \%h;" { key1 => 42 } 19:54 >

      Cheers,

      Athanasius <°(((><contra mundum Iustus alius egestas vitae, eros Piratica,

        I was working on this from the OP:

        "However, when the hash table is empty, we just call return."

        That appears to be confirmed in his code example:

        ... %$hash_ref or return; # just return ... $key; # or last line in sub returns a scalar

        So, %some_hash is ending up as one of:

        ( key1 => ($val1), key2 => ($val2) ) ( key1 => (), key2 => () ) ( key1 => ($val1), key2 => () ) ( key1 => (), key2 => ($val2) )

        These will be flattened to:

        ( 'key1', $val1, 'key2', $val2 ) ( 'key1', 'key2' ) ( 'key1', $val1, 'key2' ) ( 'key1', 'key2', $val2 )

        The first is fine; the second is what the OP had a problem with; the last two are problematic but, assuming he's using warnings, should generate an "Odd number of elements in hash assignment" diagnostic (from perldiag).

        I don't think the "comma operator returns the right-most value in the list" comes into play in any of these scenarios: all lists are either empty or only have one value.

        Am I missing something with your "return (17, 42);" example?

        — Ken

Re: Preventing unintended list expansion inside hash literals.
by Athanasius (Archbishop) on Jan 05, 2017 at 08:08 UTC

    Hello gregory-nisbet,

    IIUC, a straightforward way to get the desired behaviour is to change this line:

    %$hash_ref or return;

    — which, if %$hash_ref is undefined may return either undef or () (the empty list), depending on whether the subroutine is called in scalar or list context — to this:

    %$hash_ref or return undef;

    BTW, I get the same results with this line:

    keys %$hash_ref; # reset iterator

    commented-out. Why do you think it’s useful here?

    Anyway, hope that helps,

    Update 1: It seems I missed the point of your question, which is “What's the idiomatic way to guard against unintentionally expanding a non-singleton list into the bodies of your hashes?” I think your own solution, to “use an intermediate scalar variable” is probably the best way.

    Update 2: Alternatively, you could combine your suggestion of using the first element with a test for definedness:

    18:30 >perl -MData::Dump -wE "sub f { return (17, 42); } my %h = ( key +1 => (f())[0] // undef ); dd \%h;" { key1 => 17 } 18:30 >perl -MData::Dump -wE "sub f { return (); } my %h = ( key +1 => (f())[0] // undef ); dd \%h;" { key1 => undef } 18:30 >

    Athanasius <°(((><contra mundum Iustus alius egestas vitae, eros Piratica,

Re: Preventing unintended list expansion inside hash literals.
by vr (Curate) on Jan 05, 2017 at 09:47 UTC

    %$hash_ref or return; line is redundant, it doesn't guard from not a hashref being passed to sub, and each returns undef in scalar context on empty hash, anyway.

    If sub is guaranteed a hashref as an argument, then iterator could be reset BEFORE using each, and, not sure if "idiomatic", but definitely shorter sub can be

    sub some_key { keys %{$_[0]} ? scalar each %{$_[0]} : undef }

    p.s. Shorter:

    sub some_key { keys %{$_[0]}; scalar each %{$_[0]} }
Re: Preventing unintended list expansion inside hash literals.
by dsheroh (Monsignor) on Jan 05, 2017 at 10:00 UTC
    If you want to force it to be a scalar, you can use the scalar keyword instead of a throwaway variable:
    $ perl -MData::Printer -E 'sub some_sub { return 1 } ; %h = ( foo => s +ome_sub(), bar => 42 ); p %h' { bar 42, foo 1 } $ perl -MData::Printer -E 'sub some_sub { return } ; %h = ( foo => som +e_sub(), bar => 42 ); p %h' { 42 undef, foo "bar" } $ perl -MData::Printer -E 'sub some_sub { return } ; %h = ( foo => sca +lar some_sub(), bar => 42 ); p %h' { bar 42, foo undef }
Re: Preventing unintended list expansion inside hash literals. (return;)
by tye (Sage) on Jan 07, 2017 at 19:11 UTC

    Yep, you should change return; to return undef;. It is a very bad idea to use return; in a sub that otherwise only ever returns a single scalar.

    - tye        

Re: Preventing unintended list expansion inside hash literals.
by LanX (Sage) on Jan 05, 2017 at 19:24 UTC

    Unfortunately fat comma is only going half way to be a hash separator (which is the normal use case) and there are already many use cases exploiting list context for syntactic sugar.

    See for instance Re^4: Stop Using Perl and further discussion

    Anyway some remarks:

    • you should have returned undef in your function, since it was never intended to be used in list context and a call in scalar context would always result to undef anyway.
    • you could have checked wantarray otherwise
    • prepending scalar is the normal idiomatic way to force scalar context (sic)
    • personally I would have chosen "" . call() to get am explicit string
    • your use of each and keys is a bit worrying, because it's globally resetting the iterator of that hash

    Cheers Rolf
    (addicted to the Perl Programming Language and ☆☆☆☆ :)
    Je suis Charlie!

    update

    ) a simple ($key) = %$hash_ref would have done the trick better

      Interesting. I certainly agree with

      a simple ($key) = %$hash_ref would have done the trick better

      for clarity, terseness and, more important, lack of side effects. Unless a hash is huge:

      >perl -E "$h{$_} = $_ for 0 .. 5_000_000; say time - $^T; keys %h; say + time - $^T; $k = scalar each %h; say time - $^T; say $k" 10 10 10 852569 >perl -E "$h{$_} = $_ for 0 .. 5_000_000; say time - $^T; ($k) = %h; s +ay time - $^T; say $k" 10 176 4034296

      with equally bad effects on memory. Looks like keys in void context is harmless (performance-wise), but assignment of a hash to single-element list is not

        I did some more meditation and research...

        First of all I have to say, that I can't possibly see why one would want to store a "representative key" of a hash in another structure (like the OP did)

        Unless you want only to check if the hash is populated, then

        DB<129> scalar %empty => 0

        will do the trick without side effects ( scalar keys %empty would reset the iterator).

        Also the usual trick to copy a hash before iterating (in order to protect the iterator) is far slower than assigning to a one element list (which can't compete with each)

        DB<100> $h{$_}=$_ for 1..5e6 DB<101> $t=time; %h2=%h; print time-$t; 124 DB<102> $t=time; ($k)=%h; print time-$t; 19 DB<103> use Time::HiRes qw(time) DB<104> $t=time; ($k)=each %h; print time-$t; 7.70092010498047e-05

        BUT I have to say I'm very critical about hashes of that size, because memory consumption can easily freeze your system.

        Depending on use case I'd either consider using a database or a hash of hashes (of hashes ...) which is far more "swap friendly".

        Cheers Rolf
        (addicted to the Perl Programming Language and ☆☆☆☆ :)
        Je suis Charlie!

        ) even emptying a hash with 5e6 entries can take minutes on a netbook.

      bald
Re: Preventing unintended list expansion inside hash literals.
by Anonymous Monk on Jan 05, 2017 at 15:27 UTC
    my %some_hash; $some_hash{key1} = some_key($bar); $some_hash{key2} = some_key($baz);

Log In?
Username:
Password:

What's my password?
Create A New User
Domain Nodelet?
Node Status?
node history
Node Type: perlquestion [id://1178987]
Approved by Athanasius
Front-paged by haukex
help
Chatterbox?
and the web crawler heard nothing...

How do I use this? | Other CB clients
Other Users?
Others taking refuge in the Monastery: (3)
As of 2022-09-25 06:10 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?
    I prefer my indexes to start at:




    Results (116 votes). Check out past polls.

    Notices?