Beefy Boxes and Bandwidth Generously Provided by pair Networks
Keep It Simple, Stupid
 
PerlMonks  

sort +*, @array

by raiph (Deacon)
on Dec 09, 2013 at 00:18 UTC ( [id://1066239]=perlmeditation: print w/replies, xml ) Need Help??

»»» This post is about the immature Perl 6, not the rock solid Perl 5 «««

I often share small things about P6. This is perhaps the smallest yet...

So I just encountered this series of recent blog posts by someone exploring P6. Anyhoo, this meditation isn't about what he's written but the P6 feature brought to my attention by a comment added to his 5th post which shows something like this:

say sort +*, @array;

Apparently this is akin to a schwartzian transform (which a 4 year old PM post by Moritz suggests is implemented more efficiently than an actual schwartzian transform).

Update At least two monks failed to see a strong (or any) connection between say sort +*, @array and the ST. In summary the * is a single arg "key extractor" closure, and the P6 sort builtin automatically turns @sorted = sort +*, @array in to something like the P5 code:

@sorted = map $_->[0], sort { $a->[1] <=> $b->[1] } map [ $_, +$_ ], @array;

/Update

Perhaps everyone else already knew this, but I didn't and, weeell... I think it's amazingly short and it came directly on the heels of another tiny triumph of P6 brevity.

Of course, the main point of the transform is speed, and as the author of the quoted blog mentions in one of his posts, P6 is still slooow (even if it has gotten something like an order of magnitude faster this year for many tasks). But a secondary point of the transform is that it's a nice idiom and I think this is another example of the wide array of nice P6 idioms.

Replies are listed 'Best First'.
Re: sort +*, @array
by BrowserUk (Patriarch) on Dec 09, 2013 at 02:36 UTC
    Apparently this is akin to a schwartzian transform ...

    How can that possibly be?

    Given that the ST allows you to sort the data by a user defined piece of secondary data, and a user-defined comparison strategy -- and that secondary data needn't even be derived from the actual data in anyway -- and that construct clearly has no mechanism for providing either.

    For example, how could that construct do this:

    my $today = time; $today -= $today % 86400 + 43200;; ## midday my @dates = map $today + 86400 * $_, 0 .. 364;; my @ordered = map $_->[0], sort { $a->[1] cmp $b->[1] || $b->[2] cmp $a->[2] } map[ $_, substr(scalar localtime($_),0,3), substr(scalar localtime($_), +5,3) ], @dates;;

    I can't think of any good reason to sort the epoch dates of the next year of days, by the spelling of their days ascending and the spelling of their months descending, but if I wanted to, the ST allows me to do so.

    But there is no way that construct will.


    With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    "Science is about questioning the status quo. Questioning authority".
    In the absence of evidence, opinion is indistinguishable from prejudice.
    /div

      I concur.

      TIMOTIMO wrote:

      ... and then sorts the original list based on the corresponding result list. In perl5, this was known as the 'schwartzian transformation'.
      That is a completely inaccurate characterization of the technique known as the ST. Yes, that's one thing that could be done with the ST, but the ST itself was far more general, and by definition must be a functional pipeline of the form map-sort-map.
      But there are also other — better — ways of "sorting an original list based on the corresponding result list." The brilliance of the ST is its conciseness, not its speed.

      I reckon we are the only monastery ever to have a dungeon stuffed with 16,000 zombies.
      I think "akin to ST" is reasonable.

      It's not the decorate-sort-undecorate of a by-the-book ST, but it's a direct equivalent which retains the ST's generality and efficiency and substantially improves on its elegance.

      To understand how the P6 design retains full ST capability, follow my attempt to P6ify your example thru two versions, the first not ST equivalent, the second equivalent:

      enum days <mo tu we th fr sa su>; enum months <jan feb mar apr may jun jul aug sep oct nov dec>; my $noon = DateTime.new(now) .truncated-to(day) .delta(12, hours); my @dates = $noon, *.delta(1, day), ... *; my @ordered = sort { ~days($^a.day-of-week - 1) cmp ~days($^b.day-of-week - 1) || ~months($^b.month - 1) cmp ~months($^a.month - 1) } ], @dates[^365];

      Some may quibble with the jump from the +* construct to a { ... } closure. But +* in term position is just sugar for a (one arg) closure (which as an arg to sort is assumed to be a "key extractor" closure).

      Some may quibble with the jump from a one arg "key extractor" closure to a two arg P5 style "comparator" closure. But P6 uses the one arg closure to do the same thing for both args ($a and $b in P5) in an automatically constructed comparator closure, and assumes use of cmp for comparison. So, no need for an explicit two arg "comparator" closure unless you really want one.

      Update I had intended that folk would read this comment as a whole, including what comes next, but I think some stopped at the above. The above works in current Rakudo, reads pretty cleanly, and is fully general functionally, but is not ST equivalent in at least the sense that it doesn't cache the keys. The following reads cleanly, does cache the keys, and is fully general -- but does not work in current Rakudo.

      enum days <<:mo(1) tu we th fr sa su>>; enum months <<:jan(1) feb mar apr may jun jul aug sep oct nov dec>>; my $noon = DateTime.new(now).truncated-to(day).delta(12, hours); my @dates = $noon, *.delta(1, day), ... *; my @ordered = sort [ { ~days(.day-of-week) }, { ~months(.month) } is descending ], @dates[^365];

      Update Let's add some commentary to try make the ST equivalency pop...

      Those two closures ({ ~days(.day-of-week) }, { ~months(.month) }) are key extractor closures. They correspond to the two calls in the P5 original:

      substr(scalar localtime($_),0,3), substr(scalar localtime($_),+5,3)

      Just as one can have arbitrary before maps with P5 ST, the P6 sort builtin design supports any number of arbitrary key extractor closures. P6 then automatically generates two comparator closures, and runs the first comparison to see if it sorts between a given pair of items, and if it doesn't, runs the next comparison. This corresponds to the P5:

      $a->[1] cmp $b->[1] || $b->[2] cmp $a->[2]

      (The descending aspect of the second cmp expression in the P5 comparator closure is covered by the is descending trait on the second closure in the P6 code.)

        but it's a direct equivalent

        By that logic, every sort could be called a ST. That just doesn't make sense. Use the term as it is defined and well known: decorate-sort-undecorate!

        retains the ST's generality

        I don't see how it does that at all. Unless it can do decorate-sort-undecorate, it falls very far short of the ST's generality.

        And it's not just about the "decorate" step, which your "key extractor" closure is roughly equivalent of. Because the ST has an explicit sort block (normally, unless it's the GRT variant :-)), it allows to execute arbitrarily complex Perl code in the comparison. Where does one do that in this Perl 6 "equivalent"?

        I reckon we are the only monastery ever to have a dungeon stuffed with 16,000 zombies.

        That's not an ST, it's at best an Orcish Maneuver. (Or simply shorthand for {$a<=>$b}.)

        Some may quibble ...

        And some may simply stop reading misinformation...


        With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
        Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
        "Science is about questioning the status quo. Questioning authority".
        In the absence of evidence, opinion is indistinguishable from prejudice.
Re: sort +*, @array
by hdb (Monsignor) on Dec 09, 2013 at 10:09 UTC

    And what does say sort +*, @array; actually do?

        Thanks for your reply.

        So it is the Perl 6 equivalent to say sort {$a <=> $b} @array;?

        But how is that statement related to the Schwartzian transformation?

Re: sort +*, @array
by sundialsvc4 (Abbot) on Dec 10, 2013 at 16:21 UTC

    “Brevity” is not necessarily a virtue.   The comment above, “what does this piece of code actually do...?” is actually very telling, IMHO.

    #define SOAPBOX

    Sure, “clever” is entertaining, as “golf” is entertaining, but I frankly hate to see code that does things such as stringing-together map and grep calls.   Congratulating themselves for “cleverly” do what a nearly-equally short but easily maintainable and clear “conventional” (how boring ...) subroutine could have done just as well.   I hate these things when, and because, I have to change them for some reason.   Then ... well ... have you ever taken apart a Rubik’s Cube?

    (To clarify the engineering reason for my beef here ...)   A source-code change that is made to a “compact, elegant, clever” piece of source-code, is of necessity a highly pervasive one, affecting much more than “the minimum change necessary to implement this particular feature or to correct this particular problem.”   Because the whole thing is a Gordian Knot, the change forces itself not only upon the particular issue being addressed, but also every-other piece of logic that runs through this area ... because, likely as not (knot?), the whole thing must be torn-apart and rewritten.   From an engineering point-of-view, this is a serious loss of maintainability – and for no good reason.   Hence, a thing to be avoided, whenever possible, even at the cost of a few microseconds, because human time is really what costs money ... and, business risk.   Please give me code that “obviously” does what it’s supposed to (or that equally “obviously” doesn’t).   And, give me code that I can very easily and “obviously” change, even after its author is long gone, using one very small, very targeted suture.   Don’t oblige me to rig up a heart-lung machine.

    Give me clear, every time.   Nothing else.   If it “runs too slow,” I’ll spend $10,000 USD on a mother-of-god CPU to run it with, and it will cost me much less money to do that, than it would to tear-apart and build-up-again “clever” source code written by some long-gone someone who is nowhere to be found (or dead).   Please design your code, not for the “efficiencies” of the moment, but for maintainability aimed squarely at the many decades to come.

    #undef SOAPBOX

      I agree with the sentiment that clarity, in most scenarios, is far more important than brevity. But you surely aren't suggesting that code should be so obvious that even someone who doesn't know a given language or dialect automatically knows exactly what code in that language or dialect means?

      That said, P5ers ought to be most of the way there:

      • The sort is a (builtin) function call.
      • The prefix + coerces to numeric.
      • The * is a Whatever. (A P6 innovation, explained below.)
      • The , is the comma operator. (P6 has cleaned up the P5 warts. The comma operator never returns the last value; even in item context it returns a list, as most people would expect.)
      • @array is an array variable.

      All of this is absolutely obvious to even a beginner P6er.

      Many P5ers won't know what a Whatever is, and won't know the way the sort builtin interprets closure arguments, so I'll explain those:

      • The Whatever is as pervasive and basic in P6 as the $_ ("it") variable is in both Perls. In many scenarios, such as this one, a single Whatever means { $_ }. Add in the prefix plus and you have sort { +$_ }, @array.
      • The sort builtin interprets a single arg closure as its first arg as a "key extractor", which is the clean P6 abstraction of what the map at the start of a P5 ST is comparatively crudely abstracting.

        This is a useful explanation, thanks.

        However, it seems to me one of the issues "P5ers" have with P6 is, that they can never rely on symbols in P6 having the same meaning as in P5. You give the comma operator as one example and I am using it a lot to retrieve the last element of a list.

        So the P5ers might be there most of the way but they would not know for certain...

        A reply falls below the community's threshold of quality. You may see it by logging in.

Log In?
Username:
Password:

What's my password?
Create A New User
Domain Nodelet?
Node Status?
node history
Node Type: perlmeditation [id://1066239]
Approved by GrandFather
help
Chatterbox?
and the web crawler heard nothing...

How do I use this?Last hourOther CB clients
Other Users?
Others examining the Monastery: (5)
As of 2024-04-16 12:17 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    No recent polls found