PerlMonks  

Use of wantarray Considered Harmful

by kyle (Abbot)
on Dec 12, 2008 at 16:09 UTC

Using wantarray, a sub can tell what context it was called in so it can behave differently depending on whether its return value will be ignored, put into a scalar, or put into a list. An idiom I've seen a number of times uses wantarray to decide whether to return an array or an array reference.

sub get_x {
    my @x = qw( stuff );
    return wantarray ? @x : \@x;
}

This way, you can either write "@foo = get_x()" or "$foo = get_x()", and it will "just work".
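
To make the three contexts concrete, here is a minimal sketch. The sub name context_of_call and the @seen log are hypothetical, introduced only for illustration. wantarray returns true in list context, false but defined in scalar context, and undef in void context:

```perl
use strict;
use warnings;

my @seen;    # log of contexts, so the void call leaves a trace too

# Hypothetical helper: reports the context it was called in.
sub context_of_call {
    my $want = wantarray;
    my $name = $want         ? 'list'
             : defined $want ? 'scalar'
             :                 'void';
    push @seen, $name;
    return $name;
}

my @list   = context_of_call();    # list context
my $scalar = context_of_call();    # scalar context
context_of_call();                 # void context

print "@seen\n";    # list scalar void
```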

I am against this practice.

Which way "just works"?

At $work right now, you can find both of these:

return wantarray ? @x : \@x;
return wantarray ? @x : $x[0];

I wonder what will happen when one of those two programmers has to work on the other's code.

I didn't think context mattered

...or I didn't know I changed the context.

Imagine refactoring this:

my $dewdrop_description = get_droplet();
$dewdrop_description->{substance} = 'water';
$dewdrop_description->{molecular_array} = get_moles( 'water' );
$dewdrop_description->{temperature} = 37;

We want to eliminate the repetition. Make it like this:

my $dewdrop_description = {
    %{ get_droplet() },
    substance => 'water',
    molecular_array => get_moles( 'water' ),
    temperature => 37,
};

Nice, right? Yes, except now it's broken because at the end of get_moles() we have:

return wantarray ? @moles : \@moles;

Our straightforward refactoring has changed the context of get_moles() from scalar to list, and the result is that $dewdrop_description gets polluted with all manner of extra stuff. If you're lucky, get_moles() returns an even number of items, and you have warnings on, and you are told "Odd number of elements in anonymous hash". If you're not lucky, there's no warning, and an odd index element from get_moles() clobbers a key that get_droplet() returned.
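
The unlucky case can be reproduced in a few lines. This is only a sketch: the bodies of get_droplet() and get_moles() below are hypothetical stand-ins, chosen so that the list has an odd number of items and its odd-index element happens to match an existing key.

```perl
use strict;
use warnings;

# Hypothetical stand-ins for the subs in the refactoring example.
sub get_droplet { return { shape => 'round' } }

sub get_moles {
    my @moles = ( 'H2O', 'shape', 'square' );    # odd count: no warning
    return wantarray ? @moles : \@moles;
}

# List context: the three items are flattened into the hash constructor.
my $broken = {
    %{ get_droplet() },
    substance       => 'water',
    molecular_array => get_moles('water'),
    temperature     => 37,
};

# Forcing scalar context restores the pre-refactoring behavior.
my $fixed = {
    %{ get_droplet() },
    substance       => 'water',
    molecular_array => scalar get_moles('water'),
    temperature     => 37,
};

print "broken shape: $broken->{shape}\n";                  # square (clobbered)
print "fixed shape:  $fixed->{shape}\n";                   # round
print "fixed moles:  @{ $fixed->{molecular_array} }\n";    # H2O shape square
```

Writing scalar get_moles(...) at the call site is one way to pin the context down, but the caller has to know it is needed.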

Don't surprise me

I'm often looking at code already written for some example of how to use things. Given a "$scalar = x()", I expect that x() returns a scalar. I know very well that's not necessarily true and that context can have effects far beyond its source. Nevertheless, it generally does not occur to me that maybe this scalar-returning behavior is conditional.

A sub that returns more than one value can have unpredictable behavior in scalar context. What you get depends on whether the scalar context applies to an array or a list. If a sub returns a list, you can't just apply a scalar context to it and know what it will do without looking inside the sub to see what's to the right of return. In light of this, it almost seems merciful to use wantarray to lay down the law. A consistent use across code might actually clear some things up.
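
A short sketch of that difference, with hypothetical sub names. The only thing that changes between the two subs is what sits to the right of return:

```perl
use strict;
use warnings;

my @x = qw( a b c );

sub returns_array { return @x }                          # an array after return
sub returns_list  { return ( $x[0], $x[1], $x[2] ) }     # a list after return

my $from_array = returns_array();    # array in scalar context: element count
my $from_list  = returns_list();     # comma operator in scalar context: last value

print "$from_array $from_list\n";    # 3 c
```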

Still, I am not in favor of this.

Be consistent

Always return the same thing.

If your caller wants to stick your list in a scalar instead of an array, make it build the array reference itself.

If the list you return is so massive that copying it all is a burden, return an array reference. If the caller wants to copy it into an array anyway, it can dereference your return value and suffer the consequences.
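
As a sketch of both options (the sub names are hypothetical), neither sub consults wantarray, and the conversion is visible at the call site:

```perl
use strict;
use warnings;

# Hypothetical subs: neither one looks at wantarray.
sub get_x_list { my @x = qw( stuff more ); return @x }     # returns the values
sub get_x_ref  { my @x = qw( stuff more ); return \@x }    # returns a reference

my $ref_from_list = [ get_x_list() ];    # caller builds the reference itself
my @copy_from_ref = @{ get_x_ref() };    # caller pays for the copy explicitly

print scalar(@$ref_from_list), " ", scalar(@copy_from_ref), "\n";    # 2 2
```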

On the defense

I've seen it suggested that every assignment from a sub call should be with a list context.

my ($scalar) = why();

This way, if it's giving you a scalar, you get it. If it's giving you a list, you get the first element. If it's giving you an array, you get the first element. It's about as consistent as you can get without looking into why().
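
A minimal sketch of that defensive style, using three hypothetical subs that return a scalar, a list, and an array. The last line shows what the plain scalar assignment would have produced instead:

```perl
use strict;
use warnings;

# Hypothetical subs with three different return styles.
sub why_scalar { return 'first' }
sub why_list   { return ( 'first', 'second', 'third' ) }
sub why_array  { my @a = ( 'first', 'second', 'third' ); return @a }

my ($from_scalar) = why_scalar();    # first
my ($from_list)   = why_list();      # first
my ($from_array)  = why_array();     # first

my $plain = why_array();             # 3: scalar context yields the count

print "$from_scalar $from_list $from_array $plain\n";    # first first first 3
```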

I'm not quite to that point yet, but I can see where it's coming from.

A good use of wantarray

I think it's a good idea to use wantarray to check for void context. That can be used to catch run-time programming mistakes like calling an accessor in void context.

sub get_or_set {
    my $self = shift;
    if ( ! @_ && ! defined wantarray ) {
        die "accessor called in void context";
    }
    $self->{stuff} = shift @_ if @_;
    return $self->{stuff};
}

This will catch the case where you don't know you've passed in nothing.

my @x = (); # but I thought it was qw( foo ) !
get_or_set( @x ); # set value
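
Here is a runnable sketch of the same check as a method. The Thing class and its constructor are hypothetical, added only so the accessor has an object to work on:

```perl
use strict;
use warnings;

package Thing;    # hypothetical class, for illustration only

sub new { return bless { stuff => 'old' }, shift }

sub get_or_set {
    my $self = shift;
    if ( !@_ && !defined wantarray ) {
        die "accessor called in void context";
    }
    $self->{stuff} = shift @_ if @_;
    return $self->{stuff};
}

package main;

my $thing = Thing->new;
my @x     = ();    # the caller believes this holds a value

# The call is a non-final statement in the eval block, so it runs in void context.
my $ok = eval { $thing->get_or_set(@x); 1 };
print "caught: $@" unless $ok;

my $value = $thing->get_or_set('new');    # setter whose result is used: fine
print "$value\n";                         # new
```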

"Better" versions of wantarray

I'll note without comment that there are a couple of modules out there that do what wantarray does and more, if you're into that kind of thing.

Re: Use of wantarray Considered Harmful (Mason)
by jeffa (Chancellor) on Dec 12, 2008 at 16:40 UTC
    I recently thought wantarray would allow me to create a useful Mason component that would either return data in an array so I could format it element by element -- or return a string of that data pre-formatted. Not a good idea. In fact, Mason components already have a dual nature about their return values -- so I think that adding a decision split in the return value just compounds the problem even more. Here is the Mason code:

    <%init>
    my @foo = $m->comp( '.foo' );
    my $foo = $m->comp( '.foo' );
    </%init>

    <%def .foo>
    <%init>
    my @array = qw(foo bar baz);
    return wantarray ? @array : join( '', map "$_<br/>", @array );
    </%init>
    </%def>
Re: Use of wantarray Considered Harmful
by Tanktalus (Canon) on Dec 12, 2008 at 17:29 UTC

    On the other hand, perl's basic nature is one of context. You get different things based on different context, just talking about core functions:

    $x{time} = gmtime;
    #vs
    %x = ( time => gmtime );
    It's no different. And you aren't about to get P5P to change that behaviour.
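
A sketch of what those two lines do. Passing 0 to gmtime is an assumption made here so the output is the same on every run:

```perl
use strict;
use warnings;

my %x;
$x{time} = gmtime(0);           # scalar context: one formatted string
print "$x{time}\n";             # Thu Jan  1 00:00:00 1970

my @fields = gmtime(0);         # list context: nine broken-down fields
print scalar(@fields), "\n";    # 9

%x = ( time => gmtime(0) );     # list context: the nine fields spill into the hash
print "$x{time}\n";             # 0 (the seconds field)
```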

    Now, I'm not saying "Abandon hope all ye who enter here". It's more like it's your choice: take it as a negative to be purged, or accept it as idiomatic and a quirk of DWIMmery, and understand that context always has mattered, always will, and to keep in mind what context you're in at all times. It's really up to you how you want to take this. Personally, though I find it frustrating whenever I do forget the context, I find it far more liberating the rest of the time when the code just Does What I Mean.

    Now, as to whether it's better to have wantarray ? @x : \@x vs wantarray ? @x : $x[0], that really depends. You need to pay attention to the name of the function, as well as its normal usage. Is it normal that the caller will be aware that in certain situations only a single element will be returned, even if the callee isn't going to be able to determine that until it gets there? If so, return $x[0]. As an example, XML parsers. Though the XML parser can never be sure that an element won't be repeated, it's entirely likely that the caller knows that there can only be a single element matching a given xpath or whatever, due to an intimate knowledge of the XML and/or DTD.

    Of course, it still leaves on the user a duty to look at the docs before calling the API. But that's not any different than any other API in any other language.

    In other words, leave me my flexibility, thanks. :-P

Re: Use of wantarray Considered Harmful
by JavaFan (Canon) on Dec 12, 2008 at 18:08 UTC
    Many of Perl's builtins return different things depending on context. It's always considered a good thing if user subs can do the same thing as builtins. Perl isn't quite there yet (mainly having to do with prototypes), but the ability to return different things depending on context goes a long way.

    If you don't like user subs returning different things depending on context, what is your feeling about builtins? If you don't like the fact builtins have different behaviour depending on context, why do you program in Perl? But if you don't mind, why do you mind if user subs do?

      If you don't like user subs returning different things depending on context, what is your feeling about builtins? If you don't like the fact builtins have different behaviour depending on context, why do you program in Perl? But if you don't mind, why do you mind if user subs do?

      Good questions.

      I think that built-ins can have different rules basically because they are built-ins, they are well documented, and they are finite. I have not memorized all the myriad ways that they change their behavior (see On Scalar Context), but I certainly know all the ones I've encountered with any frequency, and a lot of them do what I expected to begin with.

      Adapting to the behavior of Perl built-ins is easy and well worth it. There aren't that many strange behaviors, they're all well documented, and I use them over and over.

      User code, on the other hand, is mostly unknown, often poorly documented (if at all), far more vast, and far less useful. I don't want to have to be suspicious of every sub I might encounter, and since the code I work on is not documented very well, I rely on examples from the code to tell me how something is to be used. There's more of it that I have to check, and the payoff is not very great because I won't spend my whole career working with it.

Re: Use of wantarray Considered Harmful (bad use)
by shmem (Canon) on Dec 12, 2008 at 19:03 UTC
    Which way "just works"?

    At $work right now, you can find both of these:

    return wantarray ? @x : \@x;
    return wantarray ? @x : $x[0];

    I wonder what will happen when one of those two programmers has to work on the other's code.

    Both are just bad practice.

    That's not something wantarray is to blame for. Both obscure the intent of the calling code and move the decision about how the return value will be used into the function itself, out of the wrong type of laziness.

    It is so much clearer to make the intent clear in the calling code,

    my $ref = [ $foo -> bar( $stuff) ];
    my @ary = $foo -> bar( $stuff);
    my $baz = ( $foo -> bar( $stuff) )[0];

    which usage doesn't even need comments.

    If the code of the method bar() produces a list, it should return a list and leave it to the caller to handle the result: stuff it into a reference, an array, place the first element in a scalar or count the list's elements. Things get interestingly different if the semantics and/or return values of the function in question are non-trivially different in scalar, list or void context, for purposes made clear in the calling code.

    But then,

    I didn't think context mattered
    ...or I didn't know I changed the context.

    context is a basic perl principle and built into the language, for good reasons, hence there's wantarray as a built-in function which is absolutely needed. Bad usage is no reason to deprecate it.

    Core perl functions behave differently when called in scalar or list context, and such behavior is documented. Your code examples just show the lack of documented coding standards at your workplace, which wantarray isn't to be blamed for, either. If you have dual-use functions or methods, you have them documented, and are aware of the context of the calling code.

    wantarray has its good uses - Contextual::Return relies on it.

      I agree that wantarray has good uses, but I rarely see them. I might even go so far as to say the good uses of wantarray are as rare as the good uses of prototypes. I'm curious as to what you'd consider a good use of it. I don't consider Contextual::Return to be a "good use" mostly because it does the same thing only moreso. It's like saying a good use of open is IPC::Open3.

        Well, as I wrote some time ago, one such use is subroutines that just produce data and behave differently in void, scalar and list context

        $thingy -> foo (1,2);           # results printed to current filehandle
        $string = $thingy -> foo (1,2); # results returned as a single string
        @lines = $thingy -> foo (1,2);  # results returned as a bunch of lines

        which works fine and is just fine - if documented.
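
A minimal sketch of such a sub, written as a plain function rather than a method. The body of foo and the "result N" line format are assumptions made for illustration:

```perl
use strict;
use warnings;

# Hypothetical data-producing sub with three documented behaviors.
sub foo {
    my @lines = map { "result $_\n" } @_;
    return @lines          if wantarray;            # list: a bunch of lines
    return join '', @lines if defined wantarray;    # scalar: a single string
    print @lines;                                   # void: print to current filehandle
    return;
}

foo( 1, 2 );                 # prints both lines
my $string = foo( 1, 2 );    # "result 1\nresult 2\n"
my @lines  = foo( 1, 2 );    # ( "result 1\n", "result 2\n" )
```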

        Another such use would be subroutines which behave like google's "I feel lucky" (scalar) vs. "normal" (list) search.

        I agree with you that good uses of wantarray are rare, and the only type of prototype I use now and then is '&', which lets me write subs that get passed blocks, like map or sort.

        I don't consider Contextual::Return to be a "good use" mostly because it does the same thing only moreso.

        That statement contradicts the OP's last heading - "Better" versions of wantarray.

      It is so much clearer to make the intent clear in the calling code,
      my $ref = [ $foo -> bar( $stuff) ];

      And yet, it is so profligate. This way of obtaining a reference to a million item array built within a sub:

      C:\test>p1
      sub x1{ my @a; $a[ $_ ] = $_ for 1 .. 1e6; return @a };;
      print time; $r = [ x1() ]; print time;;
      1229112810.36413
      1229112815.5985

      Takes 5 seconds and uses 50 MB of RAM.

      This way:

      C:\test>p1
      sub x2{ my @a; $a[ $_ ] = $_ for 1 .. 1e6; return \@a };;
      print time; $s = x2(); print time;;
      1229112742.05163
      1229112742.43096

      Takes under half a second and uses 20MB of RAM.

      If every time someone asked google to give them a reference to wikipedia, they transmitted the whole 4.4GB, people would rightly find the wasting of scarce resources distasteful.

      If you're going to go the "context is bad" route, always return the reference and let the caller expand it if they need to.
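
The comparison can be rerun as an ordinary script. This is a sketch under assumptions: it uses the core Time::HiRes module for sub-second timestamps, a smaller array than the 1e6 above to keep it quick, and hypothetical sub names. The timings depend on the machine and Perl build, so none are shown.

```perl
use strict;
use warnings;
use Time::HiRes qw( time );

my $n = 100_000;    # assumed size, smaller than the 1e6 in the post

sub as_list { my @a = ( 1 .. $n ); return @a }     # caller must copy the list
sub as_ref  { my @a = ( 1 .. $n ); return \@a }    # caller gets the array itself

my $t0       = time;
my $copied   = [ as_list() ];    # values are copied into a new anonymous array
my $t1       = time;
my $borrowed = as_ref();         # only a reference changes hands
my $t2       = time;

printf "list into new ref: %.4fs\n", $t1 - $t0;
printf "ref returned:      %.4fs\n", $t2 - $t1;
```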


      Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
      "Science is about questioning the status quo. Questioning authority".
      In the absence of evidence, opinion is indistinguishable from prejudice.
        And yet, it is so profligate.

        Of course, since it is copying all the stuff around. What about

        sub x3{ 1 .. 1e6 }
        print time; $r = [ x3() ]; print time;

        this?

        But of course that wasn't my point - I'm rather saying that it is a bad idea to hide the decision of how to deal with returns in a place far away. And no, I'm not going the "context is bad" route; but using context to obfuscate things belongs in obfuscated code only...

      Hear, hear. You make many good points.

      return wantarray ? @x : \@x;
      return wantarray ? @x : $x[0];

      I wonder what will happen when one of those two programmers has to work on the other's code.

      Both are just bad practice.

      The first one is just plain bad. Someone would have to work hard to persuade me that it's a good idea to use it. It costs functionality and doesn't buy anything. If there is a strong need to return an array ref, then why not return the reference every time? If one "needs" a version of the routine that returns a list, wrap the reference-returning version and give the wrapper a clear name.

      I don't think the second usage is necessarily bad practice. This is the same interface used by readline. If there is a good reason not to return the whole list at once (it's huge, your reads are destructive, etc), then this API makes sense.

      Context sensitive APIs are like operator overloading and tie. If used properly, they improve the readability of your code. If used badly, life begins to suck. Think carefully before using any of these techniques.


      TGI says moo

        I don't think the second usage is necessarily bad practice. This is the same interface used by readline.

        Oh no it isn't! (Cue audience...)

        readline returns line 2 on the second call, line 3 on the third call, etc -- and it only reads those lines when they're requested. This is useful, because you can stick it in a while loop and iterate over a potentially large "array" but only hold one element in memory at a time.

        return wantarray ? @x : $x[0]; does nothing so useful. It creates the entire array every single time the sub is called, and then it always returns the same first element. If the whole list is huge, then you're creating the whole huge list anyway. If the reads are destructive, then you just destroyed everything but the first item. Not so useful...
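
For contrast, here is a sketch of a sub that does advance on each scalar call, in the way described for readline. The make_reader factory is hypothetical, and it only mimics the calling convention: the items are held in memory up front rather than read lazily.

```perl
use strict;
use warnings;

# Hypothetical iterator factory: the returned closure hands out one item per
# call in scalar context, or all remaining items in list context.
sub make_reader {
    my @queue = @_;
    return sub {
        return splice(@queue) if wantarray;    # list: drain what is left
        return shift @queue;                   # scalar: next item, undef at end
    };
}

my $next = make_reader(qw( line1 line2 line3 ));

my $first  = $next->();    # line1
my $second = $next->();    # line2
my @rest   = $next->();    # ( line3 )
my $done   = $next->();    # undef: nothing left
```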

Re: Use of wantarray Considered Harmful
by ikegami (Pope) on Dec 12, 2008 at 20:45 UTC

    I'm undecided on the issue. Consider the code I've seen on PerlMonks this week:

    $field = $sth->fetchrow_array();

    The result of fetchrow_array in scalar context is documented as undefined, at least some of the time. What should fetchrow_array do in this situation? Let the result be undefined (GIGO), or provide something sensible? The latter requires wantarray (or at least a commitment to using a compatible op).

    Given a "$scalar = x()", I expect that x() returns a scalar. I know very well that's not necessarily true

    It is necessarily true. x() cannot return anything but a scalar in scalar context.

    In scalar context:
    sub x { ...; @x }: @x evaluates to the number of elements in @x, so x() returns the number of elements in @x.
    sub x { ...; ($x,$y) }: ($x,$y) evaluates to $y, so x() returns $y.
    sub x { ...; () }: () evaluates to undef, so x() returns undef.

      $field = $sth->fetchrow_array();

      As you say, the result is not documented. Some would say this is just a bug, like using length to get the length of an array.

      Something sensible in this case could use wantarray, as in:

      sub fetchrow_array {
          my @out;
          # ...
          if ( wantarray ) { # list
              return @out;
          }
          elsif ( ! defined wantarray ) { # void
              return;
          }
          elsif ( scalar @out <= 1 ) { # scalar
              return $out[0];
          }
          else { # scalar
              die 'too many return values for scalar';
          }
      }

      That seems safe and yet still not quite satisfactory. I'm not sure I'd want to have some code that worked suddenly fail because one valid query (returning one field per record) changed to another valid query (returning many fields per record).

      Given a "$scalar = x()", I expect that x() returns a scalar. I know very well that's not necessarily true

      It is necessarily true. x() cannot return anything but a scalar in scalar context.

      I should have said something like, "I know very well it might return more than one value in some other context." My point was that, seeing "$scalar = x()" implies to me that x() only returns one value, ever, in any context. I would ordinarily think that "@array = x()" would give me an array with one element. Using wantarray allows x() to violate that expectation, often for no good reason.

      I think a case can be made for context-sensitivity in DBI::fetchrow_array, and probably lots of other places. I'll say again, however, I think those cases are still very rare.

        I think I disagree with your suggestion that $scalar = x() should imply anything about how that function behaves in list context. You get the same issue without wantarray. For example,

        my @arr = qw/1 2 3 4 5 6 7 8 9/;
        sub x {
            return @arr;
        }

        Now, $scalar = x() returns the length of the array. @list = x() returns the actual array. There's no wantarray in sight, but we still get different behavior in different contexts.

        Granted, wantarray can be used to generate hard-to-understand software. Then again, so can almost any feature of every language out there.

        G. Wade

        Some would say this is just a bug, like using length to get the length of an array.

        While it is a bug (if only a documentation bug), I don't see the parallel or your point. The behaviour of length(@a) is documented, and the result is incorrect for all but arrays with one element. The behaviour of $field = $sth->fetchrow() isn't documented and it actually worked (if I remember correctly).

        I think a case can be made for context-sensitivity in DBI::fetchrow_array

        Really? What argument can be made for fetchrow_array that can't be made for other methods in general? In fact, given the name of the function, fetchrow_array, it seems to me to be the least likely of candidates.

        It's documented to be undefined, so I'm covered, but thanks for the clarification. I thought I had checked and found nothing. I fixed the post to which you replied.
Re: Use of wantarray Considered Harmful (consider)
by tye (Cardinal) on Dec 12, 2008 at 22:41 UTC

    If you want to "outlaw" wantarray then this just leads to having to outlaw returning anything but a scalar.

    return @x;          # same as return wantarray ? @x : 0+@x;
    return( ..., $y );  # same as return wantarray ? ( ..., $y ) : $y;

    Well, actually, there is one alternative to that (well, it uses wantarray but I think you'll agree that it honors the spirit of your desire to highly discourage the use of wantarray):

    croak( "This only makes sense in a list context!" ) if ! wantarray;
    # ...
    return @list;

    Note that I am not saying that it is crazy to want to discourage the use of wantarray (in spirit). One can make some good points about why it can be a good idea to just always return a reference to an array instead of returning a list of values. And one can make some good points about making scalar context fatal when you have a function that really wants to return a list.
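
A runnable sketch of the fatal-scalar-context approach, using croak from the core Carp module. The sub name get_list and its contents are hypothetical:

```perl
use strict;
use warnings;
use Carp qw( croak );

# Hypothetical list-only sub: scalar and void callers get an immediate error.
sub get_list {
    croak("This only makes sense in a list context!") if !wantarray;
    my @list = qw( a b c );
    return @list;
}

my @ok = get_list();                              # list context: fine
my $survived = eval { my $n = get_list(); 1 };    # scalar context: croaks
print "error: $@" unless $survived;
```

Because croak reports the error from the caller's perspective, the message points at the offending call site rather than at get_list itself.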

    Though I also don't find it hard to imagine cases where always returning a reference to an array would be annoying. Returning a list can often be convenient. Yes, the only difference is the need to add @{ ... }, but those three characters can get pretty obnoxious if you have to sprinkle them all over your code.

    So I can certainly imagine cases where you want to "normally" return a convenient list of values but you also want to allow for the occasional usage of getting a reference (perhaps because only occasionally the list is so large that the inefficiency of copying some big list actually matters). Yes, that leads to some specific opportunities for specific types of mistakes and thus has some disadvantages. But I'm not convinced that those disadvantages always trump the potential advantages. I think it certainly calls for some caution, but the amount of caution feels more like a style choice than something deserving of the "Considered Harmful" label.

    Similarly, I can easily imagine cases where you very often just want the one simple item of data but you also want to allow for getting much more data. This is how a ton of built-ins work, for example, caller. And I don't see the trade-offs here as obviously always winning in one direction or the other and certainly not one side winning to the point of the other side being "Considered Harmful".
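
A small sketch of caller's two behaviors. The helper name who_called is hypothetical. Without arguments, caller gives the calling package name in scalar context and a (package, filename, line) list in list context:

```perl
use strict;
use warnings;

# Hypothetical helper showing both forms of caller.
sub who_called {
    my $package = caller;    # scalar context: just the package name
    my @frame   = caller;    # list context: ( package, filename, line )
    return ( $package, scalar @frame );
}

my ( $package, $fields ) = who_called();
print "$package has $fields fields in list context\n";
```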

    My style choice is to pay attention to the cases where an expression that usually gives a simple scalar would lead to errors if it instead returned a list of not-exactly-one items and to pay attention to interfaces that confusingly jump between returning a scalar or not (such as CGI's param() method and some regex constructs).

    Most context-aware interfaces in Perl are a net win, in my experience. But it is a good point that one should consider the potential for confusion when implementing such and have some confidence that it is worth it.

    As for the several recommendations of Contextual::Return, I have to disagree. That module appears to have been implemented in a way that suffers significantly from ETOOMUCHMAGIC and I happily avoid it and encourage others to as well.

    - tye        

Re: Use of wantarray Considered Harmful
by dragonchild (Archbishop) on Dec 13, 2008 at 17:33 UTC
    First off, Contextual::Return is a great idea and only makes the problem you're describing worse. I've tried it several times and cannot do anything but ignore it.

    Second, there is really only one good use of wantarray - to return an array or an iterator. Array vs. arrayref is tolerable, but only barely. Array vs. first element is completely useless. The key here is that even though different context is used, the same thing is being returned. If you do array vs. first element, you're changing the semantics of the function - it does something different. Arrays vs. iterators are the same thing. Arrays vs. arrayrefs suck because of Perl's syntactic differentiation between arrays/hashes and references to them and the difficulty in determining scalar type. But, given that "scalar" is a type and the various reftype() values are more subtypes than first-class types, this isn't surprising. This, actually, is one of the things I'm looking forward to most in Perl 6 - actual proper typing.


    My criteria for good software:
    1. Does it work?
    2. Can someone else come in, make a change, and be reasonably certain no bugs were introduced?
        Whether or not caller and each behave sanely is a separate question. Just because it's in core doesn't mean it's a good idea. Frankly, I think that a subroutine should return one and only one type of entity. It shouldn't return a collection (array, in this case) in one situation and an element (first element, in this case) in another. That leads to serious hard-to-debug issues. This isn't theoretical - it's personal experience speaking.

Re: Use of wantarray Considered Harmful (Perl 6)
by moritz (Cardinal) on Dec 15, 2008 at 09:07 UTC
    If you think that wantarray is harmful, you might be pleased to know that in Perl 6 there is a paradigm shift: instead of returning things depending on context, the trend goes to returning an object which will behave appropriately in every context.

    For example a regex match returns a Match object, which returns the success status in boolean context, the matched string in string context, a list of positional captures in list context, and if used as a hash, it provides access to all named captures.

    Also note that want, the Perl 6 generalization of wantarray, can't always work due to multi dispatch:

    multi a(Str $x) { say "got a string" }
    multi a(@a) { say "got an array" }
    sub b {
        # what's the context here?
    }
    a(b());

    Since the dispatcher only knows which a sub it will dispatch to after seeing the return value from b(), it can't provide an unambiguous context to b().

    That said, even if we are aware of the problem above and neglect that paradigm shift, return @list; isn't a problem in Perl 6, because there are more specific contexts. In generic item context ("scalar context") it will return an Array object, and only in numeric context will the caller see the number of items in the array.

Re: Use of wantarray Considered Harmful
by tilly (Archbishop) on Dec 20, 2008 at 03:42 UTC
    Hear, hear. A general programming principle is that interfaces should be simple if possible. Context makes all interfaces more complex. Furthermore I find that my decisions about context are often the part of my interface that ages least well.

    For those who argue about built-ins, there is the concept of amortized complexity. You learn a built-in then get to use it day after day for a long time. When you amortize the cost of learning it over the use you get, a lot of complexity can be justified for minor convenience. But little user code gets amortized that much, making it harder to justify any complexity there.

Node Type: perlmeditation [id://729965]
Approved by talexb
Front-paged by Arunbear