Beefy Boxes and Bandwidth Generously Provided by pair Networks
Perl: the Markov chain saw

Re: Why doesn't Perl provide %_ as the hash equivalent of @_ in subs? (ugly++)

by tye (Sage)
on Sep 27, 2013 at 04:19 UTC ( #1055925=note: print w/replies, xml ) Need Help??

in reply to Why doesn't Perl provide %_ as the hash equivalent of @_ in subs?

makes my code less sexy

That is such a common motivation for the use of "worst practices". I don't want sexy code. I want clear code. Ugly is often a good thing. Perl is full of ugly sigils, which is something I liked from the beginning. I find they can greatly aid clarity and speed visual parsing of code.

But the worst thing that the desire for sexy code does is motivate the use of huge piles of problematic hidden complexity just to get slightly "cleaner" syntax. That leads to source filters and similar madness. I've read Devel::Declare documentation that goes to great lengths to talk about how stupid of an idea a Perl source filter is. And then the developers implement something almost identical to a source filter (and having the same level and type of problems). And their motivation for such a stupid thing? They didn't want you to have to type one single "extra" character (a semicolon).

I urge you to get over the desire to be aroused by reading source code.

I wouldn't want %_. Unpack your arguments into lexical variables at the top of the sub. Typing $_{whatever} over and over doesn't detect when you typo the key name. Please oh please don't require people to search the entire code of your routine in order to figure out what parameters can be passed to it!

Not too long ago we spent some time standardizing argument passing for our team's Perl best practices. We standardized on passing named arguments in this manner:

whatever( { user_id => $user, campaign => $cmp } );

You'll note the inclusion of ugly brace characters, { and }. Part of the reason for that is because those tell the person reading the Perl code that the stuff inside is key/value pairs forming a real hash. For example, that makes it clear that order cannot matter. It also tells Perl that these are real name/value pairs.

We also do that because it helps avoid the temptation of extending interfaces in ambiguous ways. For example, we separate the key/value pairs for DB records from key/value pairs for optional method arguments from the tiny number of simple, required method arguments (if any):

$account->add_client( $db, # DBI handle { name => $client, status => 'new' }, # Data { require_unique => 1 }, # Options );

We explored a bunch of possible interfaces for providing a helper for unpacking such arguments. But no interface was other than trivially shorter than the below. And they were all significantly less flexible and less obvious (and each had different, worse problems).

sub add_client { my( $db, $rec, $opt, @bad ) = @_; $opt ||= {}; my $uniq = delete $opt->{require_unique} || 0; my $fatal = delete $opt->{fail_fatal} // ! $uniq; my $contract = delete $opt->{contract}; my $context = _context( delete $opt->{referrer} ); argue( \$opt, \@bad, { db => $db, rec => $rec } );

argue() just complains about unknown options, too many arguments, or (the simpler cases of) missing required arguments, filling in the sub/method name and the calling line of code for you.

Those few ugly, somewhat repetitive lines have the advantage of being ordinary, real Perl code. So they support things you might want like comments. The default values can be any Perl code (including based on other arguments). You can distinguish between || and // (and other tests). You can preprocess arguments. And all of these things are completely clear to anybody who can read Perl code. Yet most of the time we have just 1 line per option.

Yep, you can certainly consider it boilerplate. But it is clear, informative, flexible boilerplate. Ugly and so very useful.

- tye        

Replies are listed 'Best First'.
Re^2: Why doesn't Perl provide %_ as the hash equivalent of @_ in subs? (ugly++)
by smls (Friar) on Sep 27, 2013 at 09:37 UTC
    "Please oh please don't require people to search the entire code of your routine in order to figure out what parameters can be passed to it!"

    Yes, it wouldn't make sense to use %_ in most long & complicated routines.

    I never intended to suggest that it would become The One True WayTM for accessing subroutine arguments.

    I just think it would be a useful additional tool in the Perl developer's toolbox, nicely complementing the existing ones (in particular @_ which it would semantically mirror very closely), to be deployed with discretion.

    I myself would use %_ sparingly, just like I currently use direct access to $_[0] (or, similarly, $_ in foreach loops) sparingly. But sparingly does not mean "never useful"!
    Remember that Perl is also popular for shell one-liners & small system administration/convenience scripts etc., not just for huge professional web applications. And not every subroutine is so big and complicated, that direct usage of @_ or %_ would get lost in the noise or make it impossible to predict unwanted side-effects.


    In the OP I've already given a use-case that can appear even in big Perl applications, where I think %_ has its place - let me elaborate on that:
    When you have, say, a module which offers functionality for which it takes a whole set of related callback functions, which when called should get access to a set of variables (but as the module author you can't predict which variables will actually be used by which user callback), then passing a hash makes sense. Now if, in the user code, the callbacks turn out to be very simple and short (say, <1 line each), then copy&pasting the same boilerplate between all of them for unpacking the parameters into a hash just adds noise - "less sexy" in this case also means "less readable".


    Besides @_, the use of %_ as I imagine it can be compared to the use of $_ in foreach loops.
    The "safe", "clean", "text-book" way is to declare an explicit loop variable, and in most cases that does indeed makes sense:

    foreach my $person (@people) { ## ... ## Long and complex code that uses $person many times, might ## potentially be refactored into nested loops in the future, etc. ## ... }

    Using $_ instead of $person in that example would entail many of the same or similar disadvantages/problems as the ones that you and other commenters have invoked against the use of %_ in subs.

    And yet, Perl does make $_ available as a built-in language feature.
    And I, for one, happily use it to make code shorter and cleaner in those cases where it doesn't hurt, e.g.:

    say $_->{name} foreach @people;
    $age_sum += $_->{age} foreach @people;

    Perl trusts me to make an informed decision when to use $_ and when not. The fact that it can be used in ways that make code unreadable or buggy, wasn't deemed enough of a reason to exclude it from the language.

    Why not apply the same principle to %_?

      Why not apply the same principle to %_?

      Because $_ costs (almost) nothing to provide; so people not using it do not pay a penalty for those that do.

      Setting up a hash from an arbitrary list of value would require substantial extra validation to see if doing so made any sense -- are there an even number; does the first of each pair make sense as a key etc. -- then building the hash and aliasing the hash values to the input arguments etc.

      That is a substantial penalty for everyone who doesn't used that hash to pay, for the convenience of the few that would.

      If you want that facility, do it yourself, cos I have no use for it and I don't want to pay the penalty.

      With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
      Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
      "Science is about questioning the status quo. Questioning authority".
      In the absence of evidence, opinion is indistinguishable from prejudice.

        If it would make all function calls slower, then I agree with you that it wouldn't be worth it.

        But does that really have to be the case?
        choroba suggested above that the hash could be built when first used... Would that not be technically feasible?

Log In?

What's my password?
Create A New User
Node Status?
node history
Node Type: note [id://1055925]
and all is quiet...

How do I use this? | Other CB clients
Other Users?
Others pondering the Monastery: (4)
As of 2017-02-21 08:59 GMT
Find Nodes?
    Voting Booth?
    Before electricity was invented, what was the Electric Eel called?

    Results (309 votes). Check out past polls.