PerlMonks  

Shift versus Sanity

by tadman (Prior)
on Apr 23, 2002 at 22:25 UTC ( [id://161471]=perlmeditation )

What's your opinion on the use of shift? Some have mentioned that during a code review, you can lose points for using it to extract function parameters. Others seem to be utterly infatuated with it, perhaps even to the point of perversion. No matter how useful I find it, there are still things that bother me, if only psychologically:
sub foo { return shift()->{foo}; }
Seems only a few operators away from fully-fledged obfuscation, but it works, no? That's the annoying part. The other annoying part is how well it works.


Out of the following, which one is the fastest?
    package Foo;

    sub with_shift {
        my $self = shift;
        return $self->{foo};
    }

    sub with_inline_shift {
        return shift()->{foo};
    }

    sub with_index {
        return $_[0]->{foo};
    }

    sub with_list {
        my ($self) = @_;
        return $self->{foo};
    }
Ideally, which is a euphemism for "in that place where all is well and good, but you can't get there from here", the named parameter method would be the fastest. Unfortunately, this is just not the case. with_inline_shift() is always the fastest, regardless of the number of parameters passed. In fact, it is about 30% faster with one parameter, and at least 10% faster when loaded with 1000 parameters.

The performance curve of with_inline_shift is very similar to that of with_index, presumably because both of them have no local variables to declare. The other two, with_list and with_shift are similarly slower. my is pretty darned expensive, don't you think?
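A minimal sketch of how such a comparison could be reproduced with the core Benchmark module. The test object, the `@extra` argument list, and the one-second run time are assumptions, not the setup used for the figures quoted above, and the results will vary with Perl version and hardware.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

package Foo;

sub with_shift        { my $self = shift; return $self->{foo}; }
sub with_inline_shift { return shift()->{foo}; }
sub with_index        { return $_[0]->{foo}; }
sub with_list         { my ($self) = @_; return $self->{foo}; }

package main;

# Hypothetical test object; the field value is arbitrary.
my $obj   = bless { foo => 'bar' }, 'Foo';
my @extra = (1 .. 10);    # extra arguments, to vary the size of @_

# Run each variant for about one CPU second and print a comparison table.
cmpthese(-1, {
    with_shift        => sub { $obj->with_shift(@extra) },
    with_inline_shift => sub { $obj->with_inline_shift(@extra) },
    with_index        => sub { $obj->with_index(@extra) },
    with_list         => sub { $obj->with_list(@extra) },
});
```

Changing the size of `@extra` shows how each variant scales with the number of parameters.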

Is this a case of "nice guys finish last" or what?

Replies are listed 'Best First'.
Re: Shift versus Sanity
by Juerd (Abbot) on Apr 23, 2002 at 22:51 UTC

    Style, readability and maintainability are much more important in most cases. That's why I always assign @_ to a list of lexicals, unless I'm coding a one-liner. In classes for tie, I often use shift and pop in examples.

        sub TIESCALAR { bless \(my $foo = pop), shift; }
        sub STORE { ${ +shift } = pop }
        sub FETCH { ${ +shift } }

        ###

        sub foo {
            my ($self, $foo) = @_;
            ...
        }

        sub foobar {
            my ($self, %options) = @_;
            ...
        }
    If I use @_ and the sub is small (5 lines or less), I do shift the object and then use @_, because that saves me an array. Besides, sometimes you want to change the values, in which case you have to use @_.

    Some examples of things that I do not like:
        # Bad
        sub foo {
            my $self = shift;
            ... not using @_ anymore
        }
        sub bar {
            my $self = shift;
            my %options = @_;
            ...
        }

        # Worse
        sub foo {
            my $self = shift;
            my $foo = shift;
            my $bar = shift;
            ...
        }

        # Awful
        sub foo {
            my $self = shift;
            my $foo = shift;
            ...
            my $bar = shift;
            ...
        }
    I dislike anything using more than one shift.

    - Yes, I reinvent wheels.
    - Spam: Visit eurotraQ.
    

          # Bad
          sub bar { my $self = shift; my %options = @_; }
      I have to disagree with you on this one, sort of. I use this construct to set defaults to named parameters like this:
          sub bar {
              my $self = shift;
              my %options = (opt1=>'foo',  # desc for opt1
                             opt2=>'bar',  # desc for opt2
                             @_);
          }
      To me, this is the best of both worlds. I get an obvious object ($self), I know what my expected named arguments are, their default values, and a brief description of each parameter. Yes, it's slightly more memory intensive, but it's so much more intuitive to me that I'll take the tradeoff any day.

      As someone else said, I'll take maintainability over a little verboseness & overhead any day. Being able to figure out what the heck a rarely used option does is invaluable 4 months (or years!) down the road.
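A small sketch of the defaults idiom in use. The `Widget` class, its constructor, and the returned hash reference are hypothetical additions for illustration. Because the caller's pairs are listed after the defaults, a later key overwrites an earlier one when the list is assigned to the hash.

```perl
use strict;
use warnings;

package Widget;    # hypothetical class name

sub new { return bless {}, shift }

sub bar {
    my $self = shift;
    my %options = (
        opt1 => 'foo',    # default for opt1
        opt2 => 'bar',    # default for opt2
        @_,               # caller-supplied pairs come last, so they win
    );
    return \%options;
}

package main;

my $w        = Widget->new;
my $defaults = $w->bar;                  # opt1 => 'foo', opt2 => 'bar'
my $custom   = $w->bar(opt2 => 'baz');   # opt1 => 'foo', opt2 => 'baz'

print "$custom->{opt1} $custom->{opt2}\n";    # prints "foo baz"
```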

        I use this construct to set defaults to named parameters like this

        Oops, forgot about that idiom. Yes, when using to set defaults, it is okay :)

        - Yes, I reinvent wheels.
        - Spam: Visit eurotraQ.
        

      sorry juerd, can't agree. another common idiom that no one's mentioned is this:

          sub do_stuff {
              my $this = shift;
              my $session = shift || die "Missing req'd Session argument";
              my $cgi = shift || new CGI ();
              ...
          }

      aesthetics aside, here are some hard figures on efficiency (slightly reformatted):

          [matt@dirty matt]$ ./test_arg_passing.pl
          Benchmark: timing 2000000 iterations of argumentative_1arg, argumentative_3args, direct_1arg, direct_3args, shifty_1arg, shifty_3args...
          argumentative_1arg: 17 wallclock secs (17.69 usr + 0.01 sys = 17.70 CPU) @ 112994.35/s (n=2000000)
          argumentative_3args: 21 wallclock secs (21.93 usr + 0.02 sys = 21.95 CPU) @ 91116.17/s (n=2000000)
          direct_1arg: 10 wallclock secs (10.86 usr + 0.04 sys = 10.90 CPU) @ 183486.24/s (n=2000000)
          direct_3args: 13 wallclock secs (12.24 usr + 0.02 sys = 12.26 CPU) @ 163132.14/s (n=2000000)
          shifty_1arg: 17 wallclock secs (18.07 usr + 0.03 sys = 18.10 CPU) @ 110497.24/s (n=2000000)
          shifty_3args: 23 wallclock secs (23.91 usr + 0.01 sys = 23.92 CPU) @ 83612.04/s (n=2000000)
          total: 42000000

      not much of a difference, though the triple shift example is a bit pathological. here's the code for the above:

          #!/usr/bin/perl -w
          use Benchmark;
          use strict;

          package ArgTest;

          sub shifty_1arg {
              my $this = shift;
              my $first_arg = shift;
              $$this += $first_arg;
          }

          sub shifty_3args {
              my $this = shift;
              my $first_arg = shift;
              my $second_arg = shift;
              my $third_arg = shift;
              $$this += $first_arg + $second_arg + $third_arg;
          }

          sub argumentative_1arg {
              my( $this, $first_arg ) = @_;
              $$this += $first_arg;
          }

          sub argumentative_3args {
              my( $this, $first_arg, $second_arg, $third_arg ) = @_;
              $$this += $first_arg + $second_arg + $third_arg;
          }

          sub direct_1arg {
              ${$_[0]} += $_[1];
          }

          sub direct_3args {
              ${$_[0]} += $_[1] + $_[2] + $_[3];
          }

          sub value { return ${$_[0]} }

          package main;

          my $total;
          bless( my $object = \$total, 'ArgTest' );
          my @args = ( 1 .. 3 );

          timethese( 2_000_000, {
              shifty_1arg         => sub { $object->shifty_1arg( @args ) },
              argumentative_1arg  => sub { $object->argumentative_1arg( @args ) },
              direct_1arg         => sub { $object->direct_1arg( @args ) },
              shifty_3args        => sub { $object->shifty_3args( @args ) },
              argumentative_3args => sub { $object->argumentative_3args( @args ) },
              direct_3args        => sub { $object->direct_3args( @args ) },
          } );

          print "total: " . $object->value . "\n";

      matt

            sub do_stuff {
                my $this = shift;
                my $session = shift || die "Missing req'd Session argument";
                my $cgi = shift || new CGI ();
                ...
            }

        As said, I think that's horrible.

            sub do_stuff {
                my ($self, $session, $cgi) = @_;
                croak 'Session not optional' unless $session;
                $cgi ||= CGI->new();
                ...
            }
        I call my objects $self, not $this. You can see the three arguments in a single line, instead of spread over three.

        aesthetics aside, here are some hard figures on efficiency (slightly reformatted)

        I am starting to think that you didn't read my post, and are only commenting on the piece of code. Efficiency is important, but not more important than readability and maintainability. Whenever efficiency is important, you probably should not be using OO.

        - Yes, I reinvent wheels.
        - Spam: Visit eurotraQ.
        

(tye)Re: Shift versus Sanity
by tye (Sage) on Apr 24, 2002 at 04:18 UTC

    If the function takes a simple list of arguments with perhaps some trailing arguments being optional, then I will often write:

        sub simple {
            my( $this, $that, $other )= @_;
    just because it is simple, takes up little space, and is easy to read.

    But if my argument handling is more advanced, or I find myself changing what arguments the function accepts, or I feel a need to add comments, or for probably quite a few other reasons, I will instead use something much closer to:

        sub complex {
            my $this= shift(@_);
            my $that= shift(@_);
            my $other= shift(@_);
    because it is much easier to make changes to.

    Note that I don't use a bare shift mostly because I really like to be able to scan for just "@_" in order to see where any argument handling is happening. I don't want to scan for that plus "shift", "pop", $_[, and several others. I also like that it makes the code a bit easier for non-Perl programmers to read (which makes it easier to have the code accepted by coworkers and managers) and clearly documents that I didn't write the code thinking I was dealing with @ARGV and then later moved it into a subroutine and broke it.

    I also use the asymmetrical spacing around the assignment operator to make it clearer that I didn't mean to write == (no, that isn't a likely source of confusion for this code, but you have to follow that convention for all assignments for it to work well).

    And I don't line up the expressions like:

        sub pretty {
            my $first=    shift(@_);
            my $second=   shift(@_);
            my $optional= shift(@_);
    as I think this scales really poorly when you decide to rename some variable and suddenly you feel obliged to reindent a bunch of nearby parts of lines, especially since no editor I've seen comes even close to automating or even assisting much in such primping.

    And I certainly don't prematurely nano-optimize for run-time speed since development time is usually much more important and run-time speed is usually much more improved by careful algorithm design than such pre-benchmarking.

    :)

            - tye (but my friends call me "Tye")
      Howdy!

      perltidy will gladly re-align those assignments for you. I've got it integrated into my favorite text editor (under Solaris -- nedit) to do the whole file or the selection.

      Mind you, I'm not real fussy about that detail, myself.

      ObThread: for OO stuff, I start with my $self = shift; and go from there. Sometimes I capture @_; other times I do more shift-ing. It depends...

      yours,
      Michael

(jeffa) Re: Shift versus Sanity
by jeffa (Bishop) on Apr 23, 2002 at 23:16 UTC
    I concur with Juerd. Sometimes I will obfuscate, but only if it is for me and me only.

    How many times have you been bitten with this:

        sub foo {
            my $foo = shift;
        }

        # and then you decide to add another parameter:
        sub foo {
            my ($foo,$bar) = shift;
        }
    I find it better to always use @_ to prevent those annoying mistakes:
    sub foo { my ($foo) = @_; }
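To make the mistake concrete, here is a short sketch (the sub names `broken` and `fixed` are made up for illustration). `shift` returns a single value, so the list assignment fills only the first variable and silently leaves the second undefined.

```perl
use strict;
use warnings;

# shift yields one scalar, so $bar never receives a value.
sub broken { my ($foo, $bar) = shift; return ($foo, $bar); }

# Assigning from @_ fills both variables.
sub fixed  { my ($foo, $bar) = @_;    return ($foo, $bar); }

my @got_broken = broken('a', 'b');    # ('a', undef)
my @got_fixed  = fixed('a', 'b');     # ('a', 'b')

print defined $got_broken[1] ? "broken kept bar\n" : "broken lost bar\n";
print "fixed: @got_fixed\n";
```

Neither `strict` nor `warnings` complains about the broken version, which is why it tends to be found only at run time.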

    jeffa

    L-LL-L--L-LL-L--L-LL-L--
    -R--R-RR-R--R-RR-R--R-RR
    B--B--B--B--B--B--B--B--
    H---H---H---H---H---H---
    (the triplet paradiddle with high-hat)
    
Re: Shift versus Sanity
by mattriff (Chaplain) on Apr 24, 2002 at 00:22 UTC
    Okay, since nobody else has said it yet, I will: I like shift(). :) Honestly, I'd never thought of it as possibly being obfuscated until I read this thread.

    If I'm working in an OOP module I habitually use shift() to get the object then use @_ for the rest of the parameters, like:

        sub object_method {
            my $self = shift;
            my ($foo,$bar,$baz) = @_;
        }

    In my mind it's a pretty clean separation of things.

    However, I suppose I'll admit that I have been bitten by this (as jeffa asked):

    my ($foo,$bar) = shift;

    More than once. ;) Usually I get irritated, jump up, amble to the other side of my cubicle wall to interrupt the marketing-type guy (revenge for the 6-7 times he'll be at my desk), head back, look at the code, slap my forehead, fix it, and move on.

    - Matt Riffle

(kudra: obj shift) Re: Shift versus Sanity
by kudra (Vicar) on Apr 24, 2002 at 11:07 UTC
    Sometimes shift is just the nicest way to get the job done. For example:
        my $self = shift;
        my %hash = @_;
    I believe that makes more sense than the alternatives. $self doesn't really have anything to do with the rest of the arguments, so it is logical to take it off first and then deal with the other options.

    I probably wouldn't use shift to get any of the other arguments, however. It's probably clear from the example that I prefer named arguments, and shift is not a good way to deal with them. Update: A rant about named arguments can now be found right here in the page instead of by viewing the page source.

    Named arguments are good if you have a lot of arguments and don't want to look up the order every time you make a function call.

    But the main reason I use them is that it is expandable. It looks a bit silly when you only have one or two arguments, but it has paid off for me often enough. Compare these two function calls:

        mysub(name => 'foo', number => 'three');
        mysub('foo', undef, undef, undef, 'three');
    In one, it's very easy to see what arguments are being used. In the other, you'd be lucky to get your arguments in the right positions.

    Of course, if you're still writing the code that uses this function, you can change all your function calls so it is possible to re-arrange the order of your arguments to put the optional ones at the end. But what if they're all optional? Or what if the code is already in production, but needs to be expanded for new uses without breaking the old uses? The most readable result will most likely be the one that uses named arguments.

    All in all, I'd rather take the risk that the function never needs more arguments and I have to type a few extra words with every function call than that it does expand and I can't easily accommodate it.

    Of course I don't consider it a hard rule. Plenty of my code doesn't use named arguments. But unless I have a good reason not to, I do use them.

Re: Shift versus Sanity
by VSarkiss (Monsignor) on Apr 24, 2002 at 00:00 UTC

    I almost always prefer the my ($self) = @_; style in general, but there are circumstances where you really want to use shift, such as when you're going to pass your own arguments down to another routine:

        my ($self) = shift;
        helper_sub(@_);
    It's really a question of judicious choices: function first, performance second. Cargo-cult programming third ;-)
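A sketch of that pass-through pattern, assuming a hypothetical `Logger` class and a `helper_sub` that simply joins its arguments. Once the object has been shifted off, `@_` holds exactly the arguments the helper should see.

```perl
use strict;
use warnings;

package Logger;    # hypothetical class name

sub new { return bless { lines => [] }, shift }

# Stand-in for the real helper; it only joins whatever it is given.
sub helper_sub { return join ', ', @_ }

sub log_all {
    my ($self) = shift;                          # remove the object ...
    push @{ $self->{lines} }, helper_sub(@_);    # ... and forward the rest
    return $self->{lines}[-1];
}

package main;

my $log = Logger->new;
print $log->log_all('a', 'b', 'c'), "\n";    # prints "a, b, c"
```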

Re: Shift versus Sanity
by rinceWind (Monsignor) on Apr 24, 2002 at 10:05 UTC
    These days, I tend to use @_ exclusively. However, there's one circumstance when shift wins: optional arguments. How about the classic get-and-set method for an accessor method of a scalar data item:
        sub foo {
            my $self = shift;
            @_ ? ($self->{foo} = shift) : $self->{foo};
        }
    However, as a general rule for optional arguments, I tend to use a named parameter scheme, passing in key=>value pairs cast internally to a hash.
    &blah( foo=>2, bar=>'Fred');
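A usage sketch for the get-and-set accessor, with a hypothetical `Thing` class wrapped around it. Because the method tests whether `@_` is empty rather than whether the argument is defined, an explicit `undef` still counts as a set.

```perl
use strict;
use warnings;

package Thing;    # hypothetical class name

sub new { return bless { foo => undef }, shift }

sub foo {
    my $self = shift;
    @_ ? ($self->{foo} = shift) : $self->{foo};
}

package main;

my $t = Thing->new;
$t->foo(42);                 # one argument left in @_: acts as a setter
print $t->foo(), "\n";       # no arguments left: acts as a getter, prints 42

$t->foo(undef);              # @_ is non-empty, so this clears the value
print defined $t->foo() ? "still set\n" : "cleared\n";    # prints "cleared"
```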
Re: Shift versus Sanity (shift is not like assignment from @_)
by grinder (Bishop) on Apr 24, 2002 at 18:15 UTC
    Wow, lots of very interesting arguments to both sides of the debate, but no-one appears to have made the point that:
    my( $foo, $bar ) = @_;
    ...is not the same as...
        my $foo = shift;
        my $bar = shift;
    The latter approach modifies the parameter list: once shifted, the arguments are gone from @_. In the former case, they stay around to haunt you, especially if someone later calls &foo without parentheses, and what's left of @_ gets passed along. Talk about effects at a distance. One alternative is to never call &foo, but foo() instead. The other alternative is to use the shift approach. (note to self: remember to adopt tye's approach to fetching parameters).
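A short sketch of that action at a distance. All sub names here are invented for illustration. Calling `&count_args;` with no parentheses hands the callee the caller's current `@_`, so whether the caller shifted or copied changes what the callee sees.

```perl
use strict;
use warnings;

sub count_args { return scalar @_ }

sub via_list {
    my ($first) = @_;       # copies; @_ is left untouched
    return &count_args;     # no parens: callee shares our current @_
}

sub via_shift {
    my $first = shift;      # removes the first element from @_
    return &count_args;     # callee sees only what is left
}

sub via_parens {
    my ($first) = @_;
    return count_args();    # explicit empty list: nothing leaks through
}

print via_list(1, 2, 3),   "\n";    # prints 3
print via_shift(1, 2, 3),  "\n";    # prints 2
print via_parens(1, 2, 3), "\n";    # prints 0
```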

    If the routine is small enough I use $_[0] (in which case what $_[0] should contain should be easy to infer from the sub's name). But not everything can be done with $_[0]. If you want to modify it you must fetch the parameter, viz:

        sub x {
            $_[0] =~ s/foo/bar/;
            $_[0];
        }

        print x('food'), "\n"; # does not work
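The reason is that the elements of `@_` are aliases to the caller's arguments. A sketch with made-up sub names: modifying `$_[0]` changes a variable in place, fails on a literal because the literal is read-only, and works on a literal once the value has been copied into a lexical.

```perl
use strict;
use warnings;

# Edits the caller's value through the alias in @_.
sub in_place { $_[0] =~ s/foo/bar/; return $_[0]; }

# Works on a private copy instead.
sub on_copy  { my ($text) = @_; $text =~ s/foo/bar/; return $text; }

my $word = 'food';
in_place($word);
print "$word\n";                  # prints "bard": the caller's variable changed

# A string literal is read-only, so the in-place edit dies.
my $ok = eval { in_place('food'); 1 };
print "literal failed: $@" unless $ok;

print on_copy('food'), "\n";      # prints "bard"
```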

    print@_{sort keys %_},$/if%_=split//,'= & *a?b:e\f/h^h!j+n,o@o;r$s-t%t#u'
      If you want to modify it you must fetch the parameter

      I don't know if you include this in "fetch". Technically you might fetch, but I wouldn't like to call it fetching the parameter(s).

          sub foobar {
              @_ = @_;
              $_[0] =~ s/foo/bar/;
              $_[0];
          }
      Cheers,
      -Anomo
Re: Shift versus Sanity
by Necos (Friar) on Apr 24, 2002 at 01:21 UTC
    I don't think shift is bad. I find myself using it in a lot of code, especially modules that I write for office use. In some subroutines, I find myself using 3-5 shifts. In one of my modules (Student::Server), I've done this:
        sub usr_create {
            my $DC;
            Win32::NetAdmin::GetAnyDomainController('','CLC', $DC);
            my $obj = shift;
            my $usr_log_file = shift;
            my $params = shift;
            if ( $obj->usr_log($usr_log_file,$params) ) { return 0; }
            my $usr = $obj->_rand_user();
            if ( not Win32::NetAdmin::UsersExist($DC,$usr) ) {
                $obj->usr_add($usr);
            } else {
                $obj->usr_create($usr_log_file,$params);
            }
        }
    Most people would probably frown upon that usage, but I like it. If I only need one parameter (per variable), then it's much easier to avoid confusion about parameters. In the above sub, I expect an object, a filename, and an array ref (which contains the order of logging parameters). In other words, a call to usr_create would look like:

        $obj->usr_create('test.txt', ['Last_Name', 'First_Name', 'State']);

    or even

        $obj->usr_create('test.txt', $aref);

    shift'ing out the necessary parameters avoids silly things like having a user pass too many arguments to the sub. In that case, anything not shift'ed out is discarded.

    Theodore Charles III
    Network Administrator
    Los Angeles Senior High
    4650 W. Olympic Blvd.
    Los Angeles, CA 90019
    323-937-3210 ext. 224
    email->secon_kun@hotmail.com
    perl -e "map{print++$_}split//,Mdbnr;"
      This is one of those examples of things that bothers people, myself included. Three shifts, and no use of @_ to justify it, really. Instead, you could just declare them in a single my and be done with it, like:
      my ($obj, $usr_log_file, $params) = @_;
      Extra params passed by the user are discarded, as one would expect.

      The reason being shifty is annoying is because it can degenerate into nonsensical situations like this:
          sub foo {
              my $self = shift;
              my $foo = shift;
              $self->something($foo, "bar", shift, "shift", "foo");
              my ($flob,$blarg,$kvetch) = (shift,shift);
              if ($kvetch = $flob->fnordicate($blarg)) {
                  shift()->refnordicate($kvetch);
              }
          }
      What, exactly, are the parameters to this function? You have to read and understand the entire function before it becomes clear. If this were much larger, that could be very frustrating. Unfortunately, this fictional example is not too far fetched.
        "shift() doesn't confuse people. People confuse people."

        I agree that the example given would be nonsensical. However, if I saw it, I wouldn't blame shift(). I'd blame the programmer who abused it in that way.

Re: Shift versus Sanity
by demerphq (Chancellor) on Apr 24, 2002 at 11:02 UTC
    Interesting post, ++ to you. I found it particularly relevant as I have been working on a post/request for comments (coming soon to Meditations :-) on a class/object technique that I have started using a lot:
        sub my_method { my ($self,$data)=shift->self_obj; }
    Speed was and is definitely not my motivation, though; it was more the elegance of the notation. Still, I am pleased to hear that it is in fact more efficient. :-)

    Other than that in general I agree with tye pretty much on this one. I often use shift as it is easier to reorganize during development, easier to document and much easier to provide defaults for.

    Of late I've found myself using $_[$x] a lot as well, primarily as I have been working on a few things where either of the other two techniques can cause problems as they copy their values and not alias to them.

    Oh yes, when I first started learning perl I reviewed a bunch of the standard modules and corresponded with GSAR about why he (more or less) exclusively used the my $x=shift; idiom (at least in Data::Dumper). Turned out there is/was a bug in an older version of perl that the shift technique worked around.

    Yves / DeMerphq
    ---
    Writing a good benchmark isnt as easy as it might look.

Re: Shift versus Sanity
by Biker (Priest) on Apr 24, 2002 at 08:17 UTC

    The only Perl statement I refuse to use is goto()


    Everything went worng, just as foreseen.

      Hmm... goto &func is really, really useful in an AUTOLOAD. I agree that goto LABEL is really bad. And it's lots faster and slightly better structured to do redo LABEL instead.

      The redo LABEL trick is something I'm using in a Scheme implementation I'm working on. It's sort of horrible, but it buys me tail call optimization and (coming soon) 'real' continuations, so I'll make the trade.
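For readers who have not seen the goto &func idiom, here is a minimal sketch of an AUTOLOAD that builds an accessor on first use. The `Lazy` class and its fields are hypothetical, and this is one common way to apply the idiom, not necessarily what the poster had in mind. `goto &sub` replaces the running AUTOLOAD call with the new sub and passes along the current `@_`.

```perl
use strict;
use warnings;

package Lazy;    # hypothetical class name

our $AUTOLOAD;

sub new {
    my ($class, %fields) = @_;
    return bless { %fields }, $class;
}

sub AUTOLOAD {
    my $name = $AUTOLOAD;
    $name =~ s/.*:://;                 # strip the package qualifier
    return if $name eq 'DESTROY';      # never autoload the destructor
    die "No such method: $name\n"
        unless ref $_[0] && exists $_[0]{$name};

    no strict 'refs';
    *{$AUTOLOAD} = sub { $_[0]{$name} };    # install a real accessor
    goto &{$AUTOLOAD};                      # re-dispatch with the same @_
}

package main;

my $obj = Lazy->new(colour => 'red');
print $obj->colour(), "\n";                               # prints "red"
print Lazy->can('colour') ? "installed\n" : "missing\n";  # prints "installed"
```

After the first call the accessor exists as an ordinary method, so later calls no longer go through AUTOLOAD.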

        Gah! Note to self: Always log on before commenting.
Re: Shift versus Sanity
by phil_g (Initiate) on Apr 24, 2002 at 15:47 UTC

    I think it's largely a stylistic choice. I tend to prefer my ($foo, $bar) = @_;. I did, at one point, do timing tests on @_ versus shift. For my test data (which consisted of object references and strings), I found that shift was generally faster for one and two parameters, both approaches were roughly the same for three parameters, and @_ was faster for more than three parameters.

Re: Shift versus Sanity
by zakzebrowski (Curate) on Apr 24, 2002 at 11:56 UTC
    Wow, my mind is spinning. I started using $x=shift; $y=shift; because it allowed me similar syntax to C (e.g. int a = argv[0]; // Or similar.. forgot C.) and the benefits of having optional parameters. I've more or less ignored @_ because it wasn't in English... Sweet. Nice to learn things every now and then... :)

    ----
    Zak

Node Type: perlmeditation [id://161471]
Approved by Ovid
Front-paged by broquaint