
Re: Memory Use and/or efficiency passing scalars to subs

by BrowserUk (Pope)
on Aug 31, 2003 at 05:24 UTC (#287984)

in reply to Memory Use and/or efficiency passing scalars to subs

The optimum way of passing arguments to subs, especially if you are going to modify the arguments, is always to use the by-reference aliases that perl provides by default (the elements of @_), rather than copying the arguments to named locals or passing references to the arguments.

The significance of the difference however, depends very much on what you are doing within the sub.

If the sub is doing anything substantial, then using named references runs a very close second to using the aliases perl gives you. That holds whether the references are taken explicitly by the caller, implicitly by perl (through prototypes), or explicitly within the called sub. The minor differences between these ways of generating the references are completely insignificant and within the bounds of 'experimental error', sometimes switching places between runs of the same benchmark.

Obviously, copying the arguments, once on the way in and once on the way out, is always the least optimal. (Is that pessimal?)
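To make the difference between copying and aliasing concrete, here is a minimal sketch. The sub names rot13_copy and rot13_alias are made up for this illustration and are not part of the benchmark; the only thing it relies on is that the elements of @_ are aliases to the caller's variables.

```perl
use strict;
use warnings;

# Copies the argument, works on the copy, and hands the result back.
sub rot13_copy {
    my( $text ) = @_;                  # copy in
    $text =~ tr[A-Za-z][N-ZA-Mn-za-m];
    return $text;                      # copy out
}

# Works directly on the alias in @_, so the caller's variable changes.
sub rot13_alias {
    $_[0] =~ tr[A-Za-z][N-ZA-Mn-za-m];
    return;
}

my $copied  = 'Hello';
my $aliased = 'Hello';

rot13_copy( $copied );                 # result discarded; $copied untouched
rot13_alias( $aliased );               # $aliased modified in place

print "copied:  $copied\n";
print "aliased: $aliased\n";
```

One caveat with the aliased form: the argument has to be something modifiable. Calling rot13_alias( 'Hello' ) with a literal dies with "Modification of a read-only value attempted".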

Large scalars

             Rate copied  proto caller direct called  NAMED
    copied  105/s     --   -52%   -62%   -62%   -62%   -62%
    proto   219/s   109%     --   -20%   -21%   -21%   -21%
    caller  274/s   162%    25%     --    -1%    -1%    -2%
    direct  277/s   164%    26%     1%     --    -0%    -1%
    called  277/s   164%    26%     1%     0%     --    -1%
    NAMED   279/s   166%    27%     2%     1%     1%     --

In the above results on large scalars, you can see that copying ( sub x{ my( $a1, $a2 ) = @_; ... } ) is much slower, as you would expect.

Using references or aliases is all much of a muchness, except when they are generated using a prototype, which for some reason is significantly slower. The only explanation I can think of for this is that the interpreter has to look up the prototype and that becomes significant--but I'll admit that it doesn't make much sense and is a guess anyway.

My conclusion

If your scalars are large, and by implication (though it wouldn't always be true) the amount of processing within the sub is substantial, then using references or aliases makes little difference.

Small scalars

However, if the sub is a convenience sub, used to clarify the calling code by naming a fairly simple operation that is performed in many places, or a method used to maintain OO-integrity of abstraction by indirecting access to the underlying data structures (think getters and setters), then the overhead of taking and naming references can become significant. In this case, using the aliases perl provides, rather than taking and naming your own references, is faster to the point that it can become worthwhile.

               Rate copied called caller  proto  NAMED direct
    copied  15741/s     --   -54%   -57%   -66%   -74%   -74%
    called  34560/s   120%     --    -5%   -25%   -43%   -44%
    caller  36557/s   132%     6%     --   -21%   -39%   -41%
    proto   46058/s   193%    33%    26%     --   -24%   -25%
    NAMED   60268/s   283%    74%    65%    31%     --    -2%
    direct  61563/s   291%    78%    68%    34%     2%     --

In these results, even though the scalars are small (40/10 chars, 1st/2nd arg), avoiding the copying is still beneficial if you are calling the sub many times. However, avoiding taking references by using the aliases that perl provides can substantially increase that benefit, as can be seen by comparing the last two results with the second and third.

Quite why using a prototype on small scalars would come out to be so beneficial relative to using a prototype on large scalars above, I am at a loss to explain. This probably indicates a flaw in the benchmark, but I've spent an inordinate amount of time trying to track it down and can't. So, I've washed my face in preparation for the egg I'm going to be wearing:)
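One possibility worth checking, offered tentatively: as posted, proto_refs is declared with a ($$) prototype and works on $a1 and $a2 without dereferencing them. A ($$) prototype only imposes scalar context on the arguments; it is the (\$\$) form that makes perl pass references implicitly. If that reading is right, proto_refs copies on the way in but never copies back out, which might account for it landing between the copied case and the reference cases. The sketch below contrasts the two forms; the names proto_backslash and proto_scalar are hypothetical and used only here.

```perl
use strict;
use warnings;

# With a (\$\$) prototype, perl passes references to the two scalar
# variables named in the call, so the sub has to dereference them.
sub proto_backslash (\$\$) {
    my( $a1, $a2 ) = @_;
    $$a1 =~ tr[A-Za-z][N-ZA-Mn-za-m];
    $$a2 =~ tr[0-9][9876543210];
    return;
}

# The ($$) form: @_ still holds aliases, but the tr operations
# only touch the copies held in $a1 and $a2.
sub proto_scalar ($$) {
    my( $a1, $a2 ) = @_;
    $a1 =~ tr[A-Za-z][N-ZA-Mn-za-m];
    $a2 =~ tr[0-9][9876543210];
    return;
}

my( $text1, $nums1 ) = ( 'Fox', '0123' );
my( $text2, $nums2 ) = ( 'Fox', '0123' );

proto_backslash $text1, $nums1;   # caller's variables change
proto_scalar    $text2, $nums2;   # caller's variables do not

print "$text1 : $nums1\n";
print "$text2 : $nums2\n";
```

Running the benchmark with -VALIDATE and checking whether the proto_refs lines actually flip the strings would confirm or refute this.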

My conclusion

Using the aliases is worthwhile for low-impact/high-use subs and methods.

Arguments against

As far as I am aware, the only arguments against using the aliases are:

  • Aesthetics.
  • Readability.
  • Maintainability.

IMO, these are effectively the same argument.

One of my criteria for judging source code to be 'aesthetic' is the ease of reading it. By this, I mean slightly more than just perceiving the symbols; it's more to do with being able to quickly grasp the intent of the code. If this is true, then the code is readable and maintainable, and therefore aesthetically pleasing.

To this end, I've included NAMED_args() in the benchmark. It probably should have been called NAMED_direct() or NAMED_alias() but that threw the presentation of the benchmark out.

Basically, this is using the aliases in $_[0], $_[1] etc., but using constants to give them meaningful names.

    use constant { STRING=>0, NUMS=>1, };

    sub foo{
        $_[STRING] =~ tr[...][...] if $_[NUMS] == '123';
    }
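A runnable variant of the snippet above, as a sketch only. The elided tr[...][...] is filled in with rot13 purely as an assumption for the sake of having something to run, and the sub name rot13_if_123 is made up for this example.

```perl
use strict;
use warnings;

# Positional indices into @_, given readable names.
use constant { STRING => 0, NUMS => 1 };

# Hypothetical example sub: rot13 the string argument in place, but
# only when the second argument is numerically equal to 123.
sub rot13_if_123 {
    $_[STRING] =~ tr[A-Za-z][N-ZA-Mn-za-m] if $_[NUMS] == 123;
    return;
}

my $first  = 'Lazy Dog';
my $second = 'Lazy Dog';

rot13_if_123( $first,  123 );   # condition true: $first is rewritten
rot13_if_123( $second, 456 );   # condition false: $second is left alone

print "$first\n";
print "$second\n";
```

Constants created with use constant are resolved at compile time, so $_[STRING] should cost the same as $_[0] at runtime, which fits with NAMED and direct finishing level in the results.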

Having your cake and eating it

I contended a while ago (though few agreed with me:) that this is useful for getting readability and maintainability whilst retaining the performance of using the aliases. In effect, this is akin to, and achieves some of, what Perl 6 achieves with the binding operator (:=), i.e. the naming of the aliases. It gives the benefit of working with named entities rather than numerically referenced, anonymous ones, while retaining the (performance) benefits of aliasing.

The aliasing happens anyway; all this does is give you a way of making best use of it without descending into the nightmare of unmaintainable code.

Of course, it doesn't address the issue of positionality, but Perl 6 is coming and we'll have to wait for the icing:).

Full benchmark

    #! perl -slw
    use strict;
    use vars qw[ $VALIDATE ];
    use Benchmark qw[ cmpthese ];

    # Copy in, modify the copies, copy out.
    sub copied_args {
        my( $a1, $a2 ) = @_;
        $a1 =~ tr[A-Za-z][N-ZA-Mn-za-m];
        $a2 =~ tr[0-9][9876543210];
        return $a1, $a2;
    }

    # Note: ($$) only imposes scalar context; as written this modifies
    # the copies in $a1/$a2, not the caller's variables.
    sub proto_refs ($$) {
        my( $a1, $a2 ) = @_;
        $a1 =~ tr[A-Za-z][N-ZA-Mn-za-m];
        $a2 =~ tr[0-9][9876543210];
        return;
    }

    # References taken explicitly by the caller.
    sub caller_refs {
        my( $a1, $a2 ) = @_;
        $$a1 =~ tr[A-Za-z][N-ZA-Mn-za-m];
        $$a2 =~ tr[0-9][9876543210];
        return;
    }

    # References taken explicitly within the called sub.
    sub called_refs {
        my( $a1, $a2 ) = \( @_ );
        $$a1 =~ tr[A-Za-z][N-ZA-Mn-za-m];
        $$a2 =~ tr[0-9][9876543210];
        return;
    }

    # The aliases in @_, used directly.
    sub direct_args {
        $_[0] =~ tr[A-Za-z][N-ZA-Mn-za-m];
        $_[1] =~ tr[0-9][9876543210];
        return;
    }

    # The aliases in @_, indexed through named constants.
    use constant { STRING=>0, NUMS=>1 };
    sub NAMED_args {
        $_[STRING] =~ tr[A-Za-z][N-ZA-Mn-za-m];
        $_[NUMS]   =~ tr[0-9][9876543210];
        return;
    }

    our $small_text = 'The Quick Brown Fox Jumps Over The Lazy Dog';
    our $large_text = $small_text x 1000;
    our $small_nums = '0123456789';
    our $large_nums = $small_nums x 1000;

    if( $VALIDATE ) {
        # Each sub is called twice: once to scramble, once to restore.
        ( $small_text, $small_nums ) = copied_args $small_text, $small_nums;
        print $small_text, ' : ', $small_nums;
        ( $small_text, $small_nums ) = copied_args $small_text, $small_nums;
        print $small_text, ' : ', $small_nums;

        caller_refs \$small_text, \$small_nums;
        print $small_text, ' : ', $small_nums;
        caller_refs \$small_text, \$small_nums;
        print $small_text, ' : ', $small_nums;

        called_refs $small_text, $small_nums;
        print $small_text, ' : ', $small_nums;
        called_refs $small_text, $small_nums;
        print $small_text, ' : ', $small_nums;

        proto_refs $small_text, $small_nums;
        print $small_text, ' : ', $small_nums;
        proto_refs $small_text, $small_nums;
        print $small_text, ' : ', $small_nums;

        direct_args $small_text, $small_nums;
        print $small_text, ' : ', $small_nums;
        direct_args $small_text, $small_nums;
        print $small_text, ' : ', $small_nums;

        NAMED_args $small_text, $small_nums;
        print $small_text, ' : ', $small_nums;
        NAMED_args $small_text, $small_nums;
        print $small_text, ' : ', $small_nums;
    }
    else {
        cmpthese( -3, {
            'copied_small' => q[
                ( $small_text, $small_nums ) = copied_args $small_text, $small_nums;
            ],
            'caller_small' => q[
                caller_refs \$small_text, \$small_nums;
            ],
            'called_small' => q[
                called_refs $small_text, $small_nums;
            ],
            ' proto_small' => q[
                proto_refs $small_text, $small_nums;
            ],
            'direct_small' => q[
                direct_args $small_text, $small_nums;
            ],
            ' NAMED_small' => q[
                NAMED_args $small_text, $small_nums;
            ],
        });

        cmpthese( -3, {
            'copied_large' => q[
                ( $large_text, $large_nums ) = copied_args $large_text, $large_nums;
            ],
            'caller_large' => q[
                caller_refs \$large_text, \$large_nums;
            ],
            'called_large' => q[
                called_refs $large_text, $large_nums;
            ],
            ' proto_large' => q[
                proto_refs $large_text, $large_nums;
            ],
            'direct_large' => q[
                direct_args $large_text, $large_nums;
            ],
            ' NAMED_large' => q[
                NAMED_args $large_text, $large_nums;
            ],
        });
    }


  • tr// and rot13.

    I used the rot13 thing because its reversibility allowed me to re-use the same test arguments for all cases, and because its runtime cost is almost entirely proportional to the length of the arguments.

    This allows the non-overhead costs of each sub to be identical and almost completely linear with the size of the arguments.

  • Prototypes.

    My comments regarding the apparent variation in the cost of using a prototype to dereference the arguments are probably worthless. I do not have an explanation for the apparent non-linearity of this. The comments are left in to see what (if any) alternative explanations they might prompt.

  • Strings and numbers (as a string).

    I appreciate that using and differentiating between the two parameters on the basis that one consists of alpha characters and the other of numerics is completely spurious. They serve only to provide the benchmark with multiple arguments, as in the original question, and nothing more.

  • Titles of testcases.

    Anyone noting that the titles shown in the benchmark results are different to those in the benchmark itself should know that the change took place in my editor as I prepared this post, because PM's code wrap 'feature' munged the results to the point where they were unreadable. The numbers are as they were generated on my pc.

    Of all the (mostly minor) irritations of using PM, the overzealous and arbitrary wrapping of code blocks is the one I most love to hate. Maybe that would make a good subject for a poll!
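On the rot13 note above: the reversibility is easy to check in isolation. This small sketch applies the same two tr operations the benchmark subs use, twice over, and shows the strings coming back to where they started. The variable names are chosen just for this example.

```perl
use strict;
use warnings;

my $original = 'The Quick Brown Fox Jumps Over The Lazy Dog';
my $digits   = '0123456789';

my( $text, $nums ) = ( $original, $digits );

# First pass scrambles both strings ...
$text =~ tr[A-Za-z][N-ZA-Mn-za-m];
$nums =~ tr[0-9][9876543210];
my( $text_once, $nums_once ) = ( $text, $nums );

# ... and a second pass restores them.
$text =~ tr[A-Za-z][N-ZA-Mn-za-m];
$nums =~ tr[0-9][9876543210];

print "once:  $text_once : $nums_once\n";
print "twice: $text : $nums\n";
```

This is also why the -VALIDATE branch of the benchmark calls each sub twice in a row: a sub that really modifies its arguments prints the scrambled form first and the original second.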

Examine what is said, not who speaks.
"Efficiency is intelligent laziness." -David Dunham
"When I'm working on a problem, I never think about beauty. I think only how to solve the problem. But when I have finished, if the solution is not beautiful, I know it is wrong." -Richard Buckminster Fuller
If I understand your problem, I can solve it! Of course, the same can be said for you.

Re: Re: Memory Use and/or efficiency passing scalars to subs
by knexus (Hermit) on Aug 31, 2003 at 16:41 UTC
    Wow, thanks for the detailed response. Being new to perl it will take me a while to fully digest it.

    However, if I understand it correctly using $_[0] certainly can't hurt. I think I like the approach of using a constant to help with naming.

    I suppose I will eventually have some style in perl. Thanks
