Beefy Boxes and Bandwidth Generously Provided by pair Networks
Keep It Simple, Stupid
 
PerlMonks  

Perl style: Arguing against three-argument join()

by martin (Pilgrim)
on Jan 24, 2008 at 16:44 UTC ( #664072=perlmeditation: print w/ replies, xml ) Need Help??

Esteemed fellow monks,

Reviewing some of my Perl code, it just occured to me that I could "optimize" this expression:

join '', $config{'prefix'}, $item, $config{'suffix'}
...into something shorter, like this:
join $item, @config{'prefix', 'suffix'}

I still prefer the first version, however, as the second one tends to obfuscate the order of parts in the final result. It could inconvenience extension, too.

Of course, simple concatenation would be even more straightforward:

$config{'prefix'} . $item . $config{'suffix'}
But I have no strong opinion there, since the join-tightly idiom seems commonplace enough to be recognized as equivalent. What do you think?

-martin

Comment on Perl style: Arguing against three-argument join()
Select or Download Code
Re: Perl style: Arguing against three-argument join()
by kyle (Abbot) on Jan 24, 2008 at 16:55 UTC

    In the case you show, I think I prefer concatenation, but I have no problem with the original join. The join with the slice stinks because I think it obscures the intent. If it had been...

    join $separator, @config{ 'item1', 'item2' };

    ...then I'd have thought that was the best of the three.

Re: Perl style: Arguing against three-argument join()
by dragonchild (Archbishop) on Jan 24, 2008 at 16:56 UTC
    It all depends on the meaning of $item. Is $item a separator or is it a real thing in and of itself? If it's a real thing, it should be off to the right. The second version makes $item into a separator. This implies that I could add things to the list and it would make sense.

    And, frankly, unless I had an array, I would use concatenation. But that's just me.


    My criteria for good software:
    1. Does it work?
    2. Can someone else come in, make a change, and be reasonably certain no bugs were introduced?
Re: Perl style: Arguing against three-argument join()
by toolic (Chancellor) on Jan 24, 2008 at 16:57 UTC
    Since you are soliciting opinions, I think your first version is an unusual usage of join. I think the most strightforward is your 3rd version (simple concatenation). And I do not think of join as being a three-argument function, as your title suggests. I think of is as two-agrument: an expression for the separator and a list.
      And I do not think of join as being a three-argument function, as your title suggests. I think of is as two-argument: an expression for the separator and a list.

      toolic, you are right, I probably should have said "three-argument usage of join" or "pseudo-separator, two-element list usage of join".

      On second thought, I might settle now for "abusing middle element of three-element list as join() separator" or something. If we take this much further we might be able to squeeze the whole discussion into a headline...

Re: Perl style: Arguing against three-argument join()
by chromatic (Archbishop) on Jan 24, 2008 at 18:31 UTC

    How does your second snippet "obfuscate the order of parts in the final result"?

      Maybe if one doesn't know Perl well enough to simply use concatenation to join two items with a separator one is bound to get confuzzled about the order things come out of a join . . . :)

      Update: Just to clarify, that's a big smiley there. I don't think the join form is unqualified blasphemy dooming the author to an eternity of maintaining phpbb mods, but it's not the first choice I'd use unless it was relatively certain that the number of things needing to be concatenated was either arbitrary or varying (e.g. you need to conditionally intersperse $config{'infix'} every fourth Thursday in months not containing an "r"). qq{$config{'prefix'}$sep$config{'suffix'}} reads just fine to me, as does the OP's concatenation version.

      The cake is a lie.
      The cake is a lie.
      The cake is a lie.

      How does your second snippet "obfuscate the order of parts in the final result"?

      The three strings to be concatenated are given in a different order than from left to right, which might be counter-intuitive for a casual reader.

        That's silly. Someone who doesn't know what join does or the arguments it takes--or who doesn't know how to look it up in a reference or in the Perl documentation--has no business maintaining a serious Perl program. While there are definitely Perl operators with poor interfaces (caller is one of my least favorite), avoiding the use of join and slices because people who don't know Perl might not understand it is silly.

        (I realize you said "casual reader", which implies someone not reading carefully, but a similar argument applies there.)

        The three strings to be concatenated are given in a different order than from left to right, which might be counter-intuitive for a casual reader.

        Fat arrow to the rescue?

        join $separator => @things{'this', 'that', 'the other'};

        However, where the task at hand is better described as prepending and appending to a string, rather than sewing together a list of arbitrary length, I concur with some others in this topic that concatenation is clearer.

        -Mike

        As a casual reader, albeit one that has used, and is used, to the join function, I had to go over the OP twice to understand what is said in the first option (join '',a,b,c). I can't really relate to using a function in a way that is counter intuitive to someone who is familiar with the way it should be used, but I'd hazard that for anyone who never used it, or Perl, neither will make sense; and if that person is to perldoc it, the first example might only serve to confuse more.

        Software speaks in tongues of man; I debug, therefore I code.
        Stop saying 'script'. Stop saying 'line-noise'.

Re: Perl style: Arguing against three-argument join()
by eric256 (Parson) on Jan 24, 2008 at 19:44 UTC

    Is $item something like ','? If so your examples beccome

    1) join '', $config{'prefix'}, ',', $config{'suffix'} 1.5) join ',', $config{'prefix'}, $config{'suffix'} 2) join ',', @config{'prefix', 'suffix'} 3) $config{'prefix'} . ',' . $config{'suffix'}

    In which case 2 is the clear winner with 1.5 just behind it. I definitly would never use 1 because there is no point in adding the join '' if you could just change the commas to periods and move on.


    ___________
    Eric Hodges
Re: Perl style: Arguing against three-argument join()
by GrandFather (Cardinal) on Jan 24, 2008 at 20:34 UTC

    Why not just:

    "$config{prefix}$item$config{suffix}";

    Perl is environmentally friendly - it saves trees
      Because to properly handle all cases, that needs to be "$config{prefix}${item}$config{suffix}" and that just looks horrible.

      My criteria for good software:
      1. Does it work?
      2. Can someone else come in, make a change, and be reasonably certain no bugs were introduced?
        Huh?
Re: Perl style: Arguing against three-argument join()
by jdporter (Canon) on Jan 24, 2008 at 21:12 UTC

    feh.

    do { local $" = $item; qq{@config{qw{prefix suffix}}}};
    A word spoken in Mind will reach its own level, in the objective world, by its own weight
Re: Perl style: Arguing against three-argument join() (++join '')
by tye (Cardinal) on Jan 24, 2008 at 22:08 UTC

    Having worked on a lot of code that does a ton of concatenating of strings (hint: you're soaking in it), I've come to dislike concatenation and much prefer join '', ... or even join "\n", ....

    You start out with "you have $count left" and then some refactoring leads to $count being replaced by $new+$old and you have too much sense to do something bizarre like "you have ${\($new+$old)} left" (which doesn't scale well anyway when the strings and expressions grow to the point that more than one line is called for) so the next obvious choice is "you have " . $new+$old . " left" which is, unfortunately, quite wrong. Fixing it starts to give you a hint of how concatenation can suck, "you have " . ($new+$old) . " left". Then somebody makes things fancier and the concatenation needs to be spread over more than one line.

    After a lot of experience writing and updating such code, I've come to prefer the following (after trying a lot of alternatives):

    $html .= join '', createLink( $USER, "you" ), " have ", $new + $old, " message", 1 == $new+$old ? "" : "s", " left";

    So I tend to start out at "you have $count left" then jump next to join '', "you have ", $new+$old, " left".

    I've also come to dislike sprintf for such concatenations. I don't like the out-of-order nature between non-format-specifier parts of the first argument and the other items being concatenated into the result. And I don't like the separation between the format specs and the values being formatted. I've also seen too many mistakes like sprintf "bleh $foo%s bar", complex($expr) where $foo might contain a % character. In most cases I find that the majority of the parts are simply being concatenated so if any of the items need 'printf' formatting, then I use sprintf to format just that item:

    $html .= join '', createLink( $USER, "you" ), " have a ", sprintf( "%+.2f", $rating ), " rating";

    Which might lead to the suggestion of here-docs, which I really dislike since they can't be indented and can fail very confusingly in the face of invisible trailing whitespace.

    I hope that leads to some to think of suggesting a "real" templating system. That's often an excellent suggestion. Of course it isn't an appropriate replacement for tons of cases of concatenation. (:

    - tye        

      You start out with "you have $count left" and then some refactoring leads to $count being replaced by $new+$old and you have too much sense to do something bizarre like "you have ${\($new+$old)} left" (which doesn't scale well anyway when the strings and expressions grow to the point that more than one line is called for) so the next obvious choice is "you have " . $new+$old . " left" which is, unfortunately, quite wrong. Fixing it starts to give you a hint of how concatenation can suck, "you have " . ($new+$old) . " left".
      Lessee, in Perl 6 we fixed the precedence of concatenation with respect to + (at your suggestion), so that part's okay...

      If you want to keep your current style you can write:

      $html ~= [~] createLink( $USER, "you" ), " have ", $new + $old, " message", 1 == $new+$old ?? "" !! "s", " left";
      but you can also say:
      $html ~= createLink( $USER, "you" ) ~ " have " ~ $new + $old ~ " message" ~ (1 == $new+$old ? "" : "s") ~ " left";
      or you can interpolate with closures:
      $html ~= "{createLink $USER, "you"} have { $new + $old } message{" +s" x ($new+$old == 1)} left";
      or spread it out blockishly:
      $html ~= "{ createLink $USER, "you" } have { $new + $old } message{ "s" x ($new+$old == 1) } left";
      or you can use methods and subs:
      $html ~= "$USER.createLink("you") have &pl($new+$old,"message") le +ft";
      or some combination of those...
      I've also come to dislike sprintf for such concatenations. I don't like the out-of-order nature between non-format-specifier parts of the first argument and the other items being concatenated into the result. And I don't like the separation between the format specs and the values being formatted. I've also seen too many mistakes like sprintf "bleh $foo%s bar", complex($expr) where $foo might contain a % character. In most cases I find that the majority of the parts are simply being concatenated so if any of the items need 'printf' formatting, then I use sprintf to format just that item:
      $html .= join '',
          createLink( $USER, "you" ),
          " have a ",
          sprintf( "%+.2f", $rating ),
          " rating";
      
      that would now look more like:
      $html ~= [~] createLink( $USER, "you" ), " have a ", $rating.fmt("%+.2f"), " rating";
      or maybe just:
      $html ~= "{createLink $USER, "you"} have a $rating.fmt("%+.2f") ra +ting";
      Which might lead to the suggestion of here-docs, which I really dislike since they can't be indented and can fail very confusingly in the face of invisible trailing whitespace.
      Well, we fixed the indent part at least, though you're still on your own with trailing whitespace.
      I hope that leads to some to think of suggesting a "real" templating system. That's often an excellent suggestion. Of course it isn't an appropriate replacement for tons of cases of concatenation. (:
      Indeed. Though with enough Ways To Do It you almost don't need a templating system... :)
        $html ~= [~] createLink( $USER, "you" ), " have a ", $rating.fmt("%+.2f"), " rating";
        Couldn't that be written as:
        [~=] $html, createLink( $USER, "you" ), " have a ", $rating.fmt("%+.2f +"), " rating";

        My criteria for good software:
        1. Does it work?
        2. Can someone else come in, make a change, and be reasonably certain no bugs were introduced?
        we fixed the precedence of concatenation with respect to + (at your suggestion)

        Thanks! And you're welcome. ;)

        I like your first example. It is nice to have a "concat this list" option so compactly.

        One of the things that I really learned to like about join '', ... is how low the precedence of comma is, so I don't have to put parens around even 'the ternary'. I agree that ?? :: should bind looser than concatenation but that also means I probably won't be doing direct concatenation of long lists of expressions. (Which is not a problem since there are better options.)

        And the rest of the options are lovely.

        Well, we fixed the indent part at least, though you're still on your own with trailing whitespace.

        I'll have to review the lastest proposal on handling of indenting as I wasn't sold on some of the previous versions. But why don't you ignore trailing whitespace after the delimiter? I've decided that it is almost always better to do s/\s+$// over chomp because trailing whitespace is invisible and really deserves to be ignored. Tell me which line has the extra leading space and which one has the extra trailing space:

        LINE LINE LINE LINE LINE

        Sure, you can do it, if you know to look specifically for it, but it just sucks to have things that break due to such. At the very least, Perl 6 should shout "I noticed you had trailing whitespace after what would have otherwise ended your here-doc, but I didn't ignore it and that is probably why I couldn't find the end of your string". (:

        Is there some value to allowing trailing whitespace to make the delimiter not the delimiter? The only value I come up with is a way to include something that looks like the delimiter in the output. I'd much prefer that be done in a much more visible way. Using such a subtle trick deserves a comment pointing out what is going on. But it is worse because (having worked even in a small company where my editor actually shows me trailing whitespace) I know that lots of editing situations don't give trailing whitespace much respect. Lots of things strip it or add it with little justification.

        Surely, several of the above interpolation techniques would be better methods for including a line equal to the delimiter in the here-doc:

        my $string= <<END; You end your here-doc with a line like: {''}END but be sure to prevent any trailing whitespace from sneaking in! END

        What purpose am I missing?

        - tye        

      ++ for the reasoning behind your choice.

      However, reading the code it strikes me there may be a more general approach. Name all final expression values and put them in variables. Then you can simply interpolate them in a qq string and be done with it.

      As a bonus, it's more self documenting.

      Also, from your example above, note how the duplication of $new + $old goes away.

      I rarely generate HTML like this, but perhaps an argument can be made that the method and sub calls above (createLink, sprintf) are special enough to do something clever with the Interpolation module to be able to write something like this:

      my $message_count = $new + $old; $html .= "$createLink{$USER, "you"} have $message_count $plural_of{"me +ssage", $message_count}";

      (Escaping the HTML is left as an exercise for the reader, as is tradition)

      /J

        The repetition of $new+$old was an artifact of quickly fitting several demonstrations in a single example. I wouldn't jump to separating out such simple expressions that are only used once so that you have to use the variable names to mentally map a large number of items being combined to what they were set to. If an expression is more complex than the ones in my example, then introducing a named variable often makes sense.

        But the benefit of a named variable usually depends heavily on the quality of the name. With such short expressions, the name of the variable is often going to be only slightly shorter and can be much less clear. I can put a lot of effort into coming up with names when coding. Great names versus unclear names makes an enormous difference in the clarity of code. So I'm not excited to introduce a new name for something so small as createLink($USER,'you') that is used only once.

        And part of the point I was making with the example was how things scale when the code deserves to be split over multiple lines. I find this makes interpolation break down as far as code readability; that is, I don't like reading several lines of strings each with several named variables interpolated and above this a long list of variables being declared and initialized. For example, it makes it hard to notice that you are doing work to assign a value to a variable that you are actually no longer using below.

        As for Interpolation.pm, you don't show how much code has to be added to create %createLink and %plural_of and then has to be understood. A function or method like createLink() is a general-purpose abstraction while you wouldn't define just %createLink and only use it, avoiding the duplication of making a createLink(). So I've never gone that route. But since interpolating into strings also doesn't scale well for me, that isn't surprising.

        - tye        

      As is traditional, I disagee. I find this far clearer than any other option shown:

      $html .= sprintf '%s have %d message%s left', createLink( $USER, "you" ), $new + $old, $new + $old == 1 ? '' : 's' ;
      1. The general format (and the constant parts) of the message are clearly identifiable on the first line.
      2. All the variable parts are clearly identified.
      3. You have complete control over the formating.
        • Leading zeros. Hex/octal.
        • Graceful transition to sci notation if number get big or small.
        • Left/right justification if needed.
      4. And it's a fully extensible construct that deals nicely with very long string too.
        $html .= sprintf '%s have %d message%s left' . ' occupying 0x%x %s of server space', createLink( $USER, "you" ), $new + $old, $new + $old == 1 ? '' : 's', $space, $units, ;

      Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
      "Science is about questioning the status quo. Questioning authority".
      In the absence of evidence, opinion is indistinguishable from prejudice.

        I think the first example you give is quite nice.

        I quite dislike your second example, however. I find it far too difficult to map between the too-many format specifiers spread (rather willy-nilly) over multiple lines and the unnumbered list of expressions below. If the format specifiers had numbers or names, then it would bother me less. I understand your point of disliking having the over-all format broken up.

        I've run into enough cases quite a bit longer than your second example and just don't find that sprintf scales to such situations. It can make quite clear code for the small cases, but I dislike the risk of variables being interpretted directly into the format string (because I've seen it happen so often when sprintf is used) and especially don't like how it doesn't scale. So I quickly jump to the method that scales the best, and also produces quite readable code, IMHO.

        - tye        

        That should be %s have %s message%s left'. Using %d in perl is rarely the right thing to do:
        $ perl -we'printf qq[%d\n], 3e9' 3000000000 $ ssh shaggy "perl -we'printf qq[%d\n], 3e9'" -1294967296
        %d is a C concept, and doesn't map well to Perl's pick-one-of-three possible formats to store a number.

        %e/%f/%g are also not completely problem free, since they could end up losing precision on a 64 bit int/53 bit mantissa perl.

Re: Perl style: Arguing against three-argument join() (++subs and objs)
by dragonchild (Archbishop) on Jan 25, 2008 at 00:08 UTC
    After thinking further, I suspect I would actually prefer the following:
    sub make_expression { my ($item, $config) = @_; return $config->{prefix} . $item . $config->{suffix}; } my $formatted_item = make_expression( $item, \%config );
    The thinking is this:
    • You have configuration mixed with input in the middle of your code. That's similar to mixing presentation and business logic. Don't do that.
    • tye is absolutely correct in that change is the only constant in maintenance. The solution most people use to handle such things is to factor out the piece that can change and give it an unchanging name. That way, you don't care how it happens - it just does.
    • This is a really small step away from creating a presentation object that has, as an initialization parameter, configuration. So, that code then becomes:
      my $formatter = Formatter->new( config => \%config ); # many lines and modules later ... my $formatted_item = $formatter->format( $item );

    My criteria for good software:
    1. Does it work?
    2. Can someone else come in, make a change, and be reasonably certain no bugs were introduced?
Re: Perl style: Arguing against three-argument join()
by sundialsvc4 (Monsignor) on Jan 25, 2008 at 17:15 UTC

    Go for the simple. Make the programmer's intent clear and obvious. Also consider future changes to the code:   try to make the code as maintainable as possible. (After you get smooshed by that bread-truck, no one's gonna want to “dig you up so you can fix it.”)

    “At 2.4 gazillion ops per second, nobody can hear you scream.”

Log In?
Username:
Password:

What's my password?
Create A New User
Node Status?
node history
Node Type: perlmeditation [id://664072]
Approved by kyle
help
Chatterbox?
and the web crawler heard nothing...

How do I use this? | Other CB clients
Other Users?
Others drinking their drinks and smoking their pipes about the Monastery: (13)
As of 2014-08-27 22:49 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    The best computer themed movie is:











    Results (253 votes), past polls