Trouble getting size of list returned from sub

by wanna_code_perl (Pilgrim)
on Nov 26, 2012 at 08:53 UTC ( #1005594=perlquestion )
wanna_code_perl has asked for the wisdom of the Perl Monks concerning the following question:

Dear monks,

I'll confess to feeling rather foolish right now. I have to work with a sub that returns a list directly. I don't need the contents of the list, just the number of elements, but I can't figure out how to get that without a temp variable. Here's a simplified example:

sub good { my @a = qw/zero one two/; @a }
sub bad  { qw/foo bar baz/ }

say scalar good(), "\t<== what I want";
say scalar bad(),  "\t<== not";

__END__
Output:
3       <== what I want
baz     <== not

I can accomplish what I want with this clumsy mess:

{ my @temp = bad(); say scalar @temp; }

Is there no way to avoid the temp variable? And will Perl actually create a copy of the list in that case? My typical solution set is in the thousands.

As for the sub itself, it's someone else's XS code that I'm loath to modify (not just something I can stick a wantarray in...), but I will if absolutely necessary.

Re: Trouble getting size of list returned from sub
by tobyink (Abbot) on Nov 26, 2012 at 09:01 UTC

    "Is there no way to avoid the temp variable?"

    Kinda. It comes down to the difference between lists and arrays. Assigning an array to a scalar gives you the length of the array. Assigning a list to a scalar gives you the final item in the list.

    That said, Perl allows arrays to be anonymous. So even if you need an array, you don't need to create a variable for it...

    say scalar @{[ bad() ]};
    perl -E'sub Monkey::do{say$_,for@_,do{($monkey=[caller(0)]->[3])=~s{::}{ }and$monkey}}"Monkey say"->Monkey::do'

      Aha! That makes sense. Thank you (and to the others who replied with similar suggestions). Now, if I do this, can you tell me anything about what's going on under the hood? I.e., is there a performance hit for shoving a 1000's-long list into an anonymous array, and then dereferencing that array to get its length? Or is Perl smart enough to do all of that with one copy of the original list in memory?

        This does, indeed, create a copy of the array. In fact, in my testing it used more memory than just creating an array variable. Surprisingly (to me, anyway) the temporary variable method was the most memory-efficient.

        However, any option you are going to use incurs a lot of overhead that could be avoided if the sub were written to handle this.

        Here is a script I used to play around with memory usage:

        use Modern::Perl;
        use Win32::OLE qw/in/;

        sub memory_usage() {
            my $objWMI = Win32::OLE->GetObject('winmgmts:\\\\.\\root\\cimv2');
            my $processes = $objWMI->ExecQuery("select * from Win32_Process where ProcessId=$$");
            foreach my $proc (in($processes)) {
                return $proc->{WorkingSetSize};
            }
        }

        sub big_list { return ('blah') x 1_000_000 };

        #Option 1: memory usage 104,443,904
        my $size = scalar @{[big_list()]};

        #Option 2: memory usage 99,958,784
        #my @arr = big_list();
        #my $size = scalar @arr;

        #Option 3: memory usage 108,441,600
        #my $size = () = big_list();

        #Comparison: memory usage 11,206,656 (results discarded when not used).
        #big_list();
        #my $size = 1; #Can't get what you want, obviously.

        say "Size: $size";
        say 'Memory usage: ', memory_usage(), "\n";

        If you are not on Windows, see this Stack Overflow question, from which I got the memory usage sub, and which also gives some non-Windows options.


        When's the last time you used duct tape on a duct? --Larry Wall

        Without wantarray perl doesn't seem to reduce memory usage

        sub mm { print( (`pslist -m $$ 2>NUL`)[-2,-1] )}
        sub ff { my @fudge = 1 .. 1_000_000; @fudge }
        sub fa { scalar @{[ &ff ]} }
        mm; ff; mm; warn fa;
        __END__
        Name     Pid      VM      WS    Priv  Priv Pk  Faults  NonP  Page
        perl     796   62168   46336   44336    48196   12572     2    34
        Name     Pid      VM      WS    Priv  Priv Pk  Faults  NonP  Page
        perl     796   66076   50280   48252    48260   13582     2    34
        1000000 at - line 4.

        Using the accumulator doesn't appear to reduce memory usage, because one function still returns a list

        sub mm { print( (`pslist -m $$ 2>NUL`)[-2,-1] )}
        sub ff { my @fudge = 1 .. 1_000_000; @fudge }
        sub fa { scalar( () = &ff ) }
        mm; ff; mm; warn fa;
        __END__
        Name     Pid      VM      WS    Priv  Priv Pk  Faults  NonP  Page
        perl    1432   62168   46336   44336    48196   12572     2    34
        Name     Pid      VM      WS    Priv  Priv Pk  Faults  NonP  Page
        perl    1432   66076   50280   48252    48260   13582     2    34
        1000000 at - line 4.

        I suppose this could actually be optimized; I see no technical reason why not, but it's fairly minor.

      It comes down to the difference between lists and arrays. [....] Assigning a list to a scalar gives you the final item in the list.

      Oh, yet another falls victim to this seductive lie. It seems to explain so many things after you first "learn" it. Then you start applying it in more and more places and either just start putting out rather nonsense explanations or start dreaming up elaborate schemes about "that isn't really a 'list'". And then you get an emotional attachment to it and start yelling at people for calling things "list" when clearly they mustn't do that because that thing doesn't behave like you think a "list" should in a scalar context. But in the end, it mostly just gets in the way of deeply and accurately understanding the behavior of complex (and some simple) Perl constructs.

      The simpler truth is that just about everything in Perl that can return a list can also decide exactly what it wants to return in a scalar context. There is no "list vs. array" dichotomy in Perl. An array returns its size, that much is true. A list literal (aka. "the comma operator") returns its "last 'item'", where the exact definition of "item" belies the conflation of "list" and "list literal".

      What qw// returns in a scalar context actually depends on your version of Perl. Some versions of Perl implement qw// as a list literal and so, in those versions of Perl, qw// in a scalar context returns its last item. Other versions of Perl don't and return something else for qw// in a scalar context.

      qw// is just another example of the few constructs in Perl where nobody bothered to design an answer to "what should we return in a scalar context?", and so what we got was an accident of implementation details (or even optimizations).

      It seems to me that putting qw// into a scalar context is most likely to indicate a bug and so I wouldn't be surprised if a future version of Perl simply makes that fatal (or just a warning).

      Indeed, it is common for many non-array things in Perl to return "the last 'item'" in a scalar context (where the definition of 'item' is somewhat slippery, if you are paying close attention). And it is common to talk about many such non-array things as being "a list". But not all "lists" in Perl return their last item in a scalar context. And arrays actually are lists.

      The real "difference between lists and arrays" is that an array is a list that is stored in a variable (named or anonymous) and so can persist longer and allow more operations (like pop). The answer to "What to return in scalar context?" is much more fine-grained than "is it an array or not?", despite it not appearing so when you first start looking.
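To make the point above concrete, here is a minimal sketch of how several list-returning constructs each choose their own scalar-context result. The variable names are made up for illustration, and the qw// case is deliberately left out because, as noted above, its scalar-context behavior depends on the Perl version.

```perl
use strict;
use warnings;

my @array = qw(zero one two);

# An array in scalar context yields its element count.
my $array_size = @array;                          # 3

# The comma operator in scalar context evaluates each operand
# and yields the last one.
my $comma_last;
{
    no warnings 'void';    # the discarded operands would otherwise warn
    $comma_last = ('foo', 'bar', 'baz');          # 'baz'
}

# A list assignment in scalar context yields the number of
# right-hand-side elements, which is what "() = ..." relies on.
my $assign_count = () = ('foo', 'bar', 'baz');    # 3

# grep in scalar context yields the number of matches.
my $grep_count = grep { /a/ } qw(foo bar baz);    # 2

# reverse in scalar context concatenates its arguments and
# reverses the resulting string.
my $reversed = reverse 'ab', 'cd';                # 'dcba'

# A list slice in scalar context yields the last selected element.
my $slice_last = (qw(foo bar baz))[0, 1];         # 'bar'

print join(' ', $array_size, $comma_last, $assign_count,
                $grep_count, $reversed, $slice_last), "\n";
```

None of these results follows from a single "list vs. array" rule; each operator defines its own behavior.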

      - tye        

Re: Trouble getting size of list returned from sub
by 2teez (Priest) on Nov 26, 2012 at 09:04 UTC

    this can also work:

    my $n = () = bad();
    print $n;    #print 4

    sub bad { return qw/foo bar baz timon/; }

    If you tell me, I'll forget.
    If you show me, I'll remember.
    if you involve me, I'll understand.
    --- Author unknown to me

      You can also fold the variable into the same line as the print:

      say ( my $mp =()= bad() );

      But I wouldn't say that's an improvement. In fact, I would tuck all this away into a routine, to hide details and leave only the high-level purpose.

      To learn more about this ( the goatse operator ) and other fun, little-known operators, check out secret perl operators
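As a rough sketch of the "tuck it away into a routine" idea, the idiom can be hidden behind a named helper. The name count_returned and its calling convention are hypothetical, not something from the thread. The second half shows why the idiom works: a list assignment in scalar context yields the number of elements on its right-hand side, whatever is on the left.

```perl
use strict;
use warnings;

sub bad { qw/foo bar baz/ }    # same shape as the sub in the question

# Hypothetical helper: calls the given code reference in list context
# and returns only the number of values it produced.
sub count_returned {
    my ($code, @args) = @_;
    my $count = () = $code->(@args);
    return $count;
}

print count_returned(\&bad), "\n";     # 3

# The left side need not be empty: the count still comes from the right.
my $n = ( my ($first, $second) = bad() );
print "$n $first $second\n";           # 3 foo bar
```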

      As Occam said: Entia non sunt multiplicanda praeter necessitatem.

Re: Trouble getting size of list returned from sub
by kcott (Abbot) on Nov 26, 2012 at 09:07 UTC

    G'day wanna_code_perl,

    You can achieve this by wrapping bad() in @{[...]}.

    $ perl -Mstrict -Mwarnings -E '
        sub bad { qw/foo bar baz/ }
        say scalar bad();
        say scalar @{[bad()]};
    '
    baz
    3

    -- Ken

Re: Trouble getting size of list returned from sub
by ColonelPanic (Friar) on Nov 26, 2012 at 10:34 UTC
    Here is a solution that potentially uses less memory:
    my $size = scalar map {1} big_list;

    Explanation: All of the other options involve copying the returned values into an array behind the scenes. This one does not; instead it creates an array with the same number of elements, but containing the value "1" for each element. This will use less memory if the sub is returning a list of large items (such as long strings).

    This would be worth trying if you have problems with high memory usage.

    (To test this, see the sample code with memory usage test above)


      Actually they all do that: they copy the list to the "stack". It's just that map introduces a scope, and by the time it ends, perl has a chance to release the memory it used back to the OS (an optimization).

      For a long time map/grep in scalar/void context took double the memory of the equivalent for loop, so depending on perl version, memory usage will differ between these two snippets:

      warn scalar grep {1} foo();

      my $count = 0;
      $count++ for foo();
      warn $count;

      wantarray is what you want to use if you want to be sure :)
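For a sub you are able to edit, a minimal sketch of the wantarray approach might look like this. The sub name numbers_up_to is made up for illustration; the point is that the scalar-context branch returns the count without ever building the list.

```perl
use strict;
use warnings;

# Hypothetical sub: in list context it returns the numbers 1..$n,
# otherwise it returns just the count and never builds the list.
sub numbers_up_to {
    my ($n) = @_;
    return $n unless wantarray;
    return (1 .. $n);
}

my @all   = numbers_up_to(5);            # list context
my $count = numbers_up_to(1_000_000);    # scalar context, no big list built

print "@all\n";      # 1 2 3 4 5
print "$count\n";    # 1000000
```

This does not help with an XS sub you cannot change, which is the situation in the original question.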

        Actually, it is not a scoping issue. By that logic, this should also save memory:
        sub get_size { my @a = big_list(); return scalar @a }
        my $size = get_size();

        However, in a simple test, that uses the same memory as a global array that does the same thing.

        Perl, generally speaking, does not release memory back to the OS unless your system is running out of memory. The memory from lexical variables can be claimed and reused by Perl, but it doesn't go back to the OS.

        All of the other methods created an array in addition to what was copied on the stack. map only creates an array of 1s in addition to what was copied on the stack; thus it uses less memory (unless your original data is no bigger than integers, of course) (and yes, this was not always true in older versions of Perl).

Re: Trouble getting size of list returned from sub
by DrHyde (Prior) on Nov 26, 2012 at 12:24 UTC
    $ perl -E 'sub foo { qw(a b c d) } say $#{[foo()]} + 1'
Re: Trouble getting size of list returned from sub
by brx (Pilgrim) on Nov 26, 2012 at 13:24 UTC

    In perldata, you can see: If you evaluate an array in scalar context, it returns the length of the array. (Note that this is not true of lists, which return the last value....

    Your sub bad() returns the value of the last expression evaluated (see return). The last and only expression in bad() is a list:

    • list in scalar context -- each element of this list is evaluated and the last one is returned
    • list in list context -- bad() returns the list (3 elements) then scalar counts the number of elements. List context can be forced with  () = ... (see examples).

    use strict;

    sub good { my @a = qw/zero one two/; @a }
    sub bad { qw/foo bar baz/ }

    print scalar good(), "\t<== what I want"; print "\n";
    print scalar bad(), "\t<== not"; print "\n";
    print scalar ( () = bad()), "\t<== list context, then scalar()"; print "\n";
    print "baz is ".bad(); print "\n";
    print "1+2 = ".( () = bad()); print "\n";

    __END__
    Output:
    3       <== what I want
    baz     <== not
    3       <== list context, then scalar()
    baz is baz
    1+2 = 3

    English is not my mother tongue.
    Les tongues de ma mère sont "made in France".
Re: Trouble getting size of list returned from sub
by Don Coyote (Monk) on Nov 26, 2012 at 13:41 UTC

    Could adding an evaluation to the end of the sub something like;

    sub gettinglist {
        $ListItemReturnedCount++ if { Item retrieved };
        return $ListItemReturnedCount;
    }

    be a possible alternative? OK, perhaps while debugging but not in production, as an unexpectedly short list would be returned to a calling routine somewhere. So, seeing as the sub is returning a list anyway, could you not retrieve the number from some other place, for example from the routine the list is returned to?

      Of course this could be easily solved if one can edit the subroutine. However, as the poster noted, it's someone else's code and an XS sub, not a Perl sub.

Re: Trouble getting size of list returned from sub
by ColonelPanic (Friar) on Nov 26, 2012 at 14:43 UTC
    Yet another solution that uses less memory than a temporary array:
    sub get_size { scalar @_ }
    my $size = get_size(big_list());

    Using the list returned from the sub as the args for another subroutine call is a sneaky way to get an array without actually making a copy.

    It makes for pretty intuitive code, too. And it probably isn't limited to recent Perl versions, as the map solution is.
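A small sketch of why passing the list as arguments avoids an extra array: the elements of @_ are aliases to the values passed in, not fresh copies. Here big_list is a small stand-in for the real sub, and bump_first is a hypothetical helper added only to make the aliasing visible.

```perl
use strict;
use warnings;

sub big_list { return ('blah') x 1000 }    # stand-in for the real sub
sub get_size { scalar @_ }

my $size = get_size(big_list());
print "$size\n";    # 1000

# Elements of @_ alias the caller's values, so modifying $_[0]
# changes the variable that was passed in.
sub bump_first { $_[0]++; return }

my $n = 1;
bump_first($n);
print "$n\n";       # 2
```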


Node Type: perlquestion [id://1005594]
Approved by kcott
Front-paged by 2teez