Beefy Boxes and Bandwidth Generously Provided by pair Networks
P is for Practical
 
PerlMonks  

Advanced Sorting - GRT - Guttman Rosler Transform

by demerphq (Chancellor)
on Feb 15, 2002 at 10:47 UTC ( #145659=perlmeditation: print w/ replies, xml ) Need Help??

Many people are familiar with the advanced sorting technique called the Schwartzian Transform also referred to as the ST which is of course named after our very own merlyn.

The basic idea of this technique is to speed up sorts where complex functions are required to be called to put the data into order by precomputing the various keys that will be used. The precomputed keys, and the actual item is usually wrapped in an anonymous array (or anonymous hash, basically any kind of transitory container) through a map, the output of which is then sorted based on the contents of the various keys, and then fed into another map that unwraps the payload and returns the original values, but in the correct order. This process can result in signifigant time savings. An example of the technique is as follows. Lets say we have a list of words, and we want to sort them by the number of 'e's present in each word, then by the actual word. Using the ST we would do something like this:

my @words=qw(The time has come the Walrus said to speak of many things +); my @sorted=map { pop @$_ } sort{ $a->[0] <=> $b->[0] || $a->[1] cmp $b->[1] } map { [tr/eE/eE/,$_] } @words;
The Guttman Rosler Transform or GRT is a variant (refinement) of this technique that attempts to avoid using a perl inorder function for the sort (as it is perl code is slower than the internal lexicographical or numeric sorting which is implemented in C) and to avoid the overhead of creating many (potentially thousands) of anonymous arrays (or containers, they need not be arrays after all).

The basic idea is to encode the precomputed keys into a string, usually using pack() such that perl can sort them lexicographically (also called ascii sort or char sort). Thus the above sort can be transformed into

my @sorted=map { substr($_,4) } sort map { pack("LA*",tr/eE/eE/,$_) } @words;

Generally speaking the GRT is a signifigant improvement on the ST, although it can be a bit tricky to work out the appropriate pack/unpack functions. (Unless you're tye of course ;-)

Anyway, enough of my explanations, anybody who wishes to learn more about this technique and other advanced sorting techniques should get it from the horses mouth and have a look at the excellent article A Fresh Look at Efficient Perl Sorting

Recent Update: tye has come up with two excellent nodes that cover both the ST and GRT and some practical variants. Check em out: Re^2: Stable sorting in Perl (old) and fast, flexible, stable sort
Updates: fixed typos, added clarification :-)

Yves / DeMerphq
--
When to use Prototypes?

Comment on Advanced Sorting - GRT - Guttman Rosler Transform
Select or Download Code
Re: Advanced Sorting - GRT - Guttman Rosler Transform
by dws (Chancellor) on Feb 15, 2002 at 21:14 UTC
    The basic idea is to encode the precomputed keys into a string, usually using pack() such that perl can sort them lexicographically

    This is a key point. I've seen it overlooked in two cases, both leading to subtle sorting problems of the "well, this stuff sorts most of the time" variety.

    A naive read of Guttman-Rosler is "hey, let's cram the key parts together in a way we can uncram them later." That's necessary, but not sufficient. Simple concatenation won't do. You must pack(), or concatenate in such a way that the keys are aligned for a lexicographic sort. If you see someone use join in a Guttman-Rosler search, watch out!

    Consider the difference between

    Naive join Aligned keys foo 47 foo 47 foo 103 foo 103
    If you don't align the keys (typically via pack or a carefully crafted sprintf), you risk getting composite keys that sort correctly most of the time. A variant of the problem can occur with non-numeric strings when joining them with a character like '|'.

Re: Advanced Sorting - GRT - Guttman Rosler Transform
by dmmiller2k (Chaplain) on Feb 18, 2002 at 19:14 UTC

    Wow. I use this trick all the time; I am happy to finally learn it has a name.

    I came up with (so I thought) the idea of packing the array elements into decodable strings, sorting those, then decoding them, in the midst of trying to optimize a particulary hairy Sybase query. I was able to get rid of all the table scans (non-indexed searches) except the last one (the ORDER BY clause). I started experimenting with sorting in Perl and then got sucked into trying to speed it up.

    I suppose I've always thought of it as a variation on the ST: precalculating the sort keys using map, sorting based upon those keys, and finally, another map to extract the actual data again. I never realized that the ST specified explicitly creating anonymous arrays.

    Thanks (and ++), demerphq, for giving this a name.

    Update: I forgot to add that, I've been using sprintf(), rather than pack(), which would obviously be even faster (did I say, "Thanks," demerphq?), but in my case, there were enough rows that the key generation phase was more-or-less insignificant compared with the actual sorting process. The hardest part is inverting the portions of the key which must sort descending.

    In other words, if using dws's example, one had to sort ascending by the first column ('foo' in the example), then descending by the second column (47 and 103, respectively), you need to invert the second column by using 10's complement for the number of significant digits you have. E.g., if the name column has up to 10 characters, and the value column can go up to 10000 (four digits),

    my @sorted = map { [substr($_, 0, 10), 10000-substr($_, 10, 4)] } sort map { sprintf("%10.10s%04d", $_->[0], 10000-$_->[1]) @uns +orted;

    dmm

    If you GIVE a man a fish you feed him for a day
    But,
    TEACH him to fish and you feed him for a lifetime
      I suppose I've always thought of it as a variation on the ST: precalculating the sort keys using map, sorting based upon those keys, and finally, another map to extract the actual data again. I never realized that the ST specified explicitly creating anonymous arrays.
      And I agree with you, but I'm not willing to fight for that belief being universal (that the GRT is really just a specialization of the ST). I've got other issues to fight.

      In case you were curious. {grin}

      Another way to look at it is that I made "map-sort-map" popular in the Perl community, by issuing one specific use of it which some have come to know as the ST. The GRT gang came up with another use that is harder to generalize, but can produce better results for a subrange of problems. I don't believe that the GRT can be sufficiently generalized in any practical sense to cover all problems, hence knowing both manifest map-sort-map strategies (and inventing others) is probably the best meta-strategy.

      -- Randal L. Schwartz, Perl hacker

        And I agree with you, but I'm not willing to fight for that belief being universal (that the GRT is really just a specialization of the ST).

        Actually, if I've presented the GRT as anything other than a refinement of the Schwartzian Transform then I've communicated myself poorly.

        The whole reason I started by discussing the Schwartzian Transform in my original post was to explain GRT in the context of the Schwartzian Transform.

        Personally I would call anything along these lines a Schwartzian Transform, but when it was warranted would instead use the more specific term GRT, but this is not in the slightest meant to imply that a GRT isn't a Schwartzian Transform. I suppose its like the square/rectangle idea. All GRTs are Schwartzian Transforms but not all Schwartzian Transforms are GRTs.

        I would say that in the texts I've read (and your own snippet here in the monastery) Schwartzian Transforms are characterized by using an inorder function that accesses part of a wrapper containing the precomputed keys, whereas GRTs are characterized by not having an explicit inorder function at all.

        I don't believe that the GRT can be sufficiently generalized in any practical sense to cover all problems, hence knowing both manifest map-sort-map strategies (and inventing others) is probably the best meta-strategy.

        I couldnt agree more. (And stated something like this at the end of my post, although not in so many words)

        Oh and thank you for introducing such an interesting technique to me and the community as a whole. I use it all the time.

        PS - It would be really cool if you retitled your snippet Schwartzian Transform to "Schwartzian Transform (ST)" so that we could link to it via [ST] ;-) Or even write a new node explaining it a bit, you know straight from the horses mouth so to speak. But perhaps you dont have enough time?

        Yves / DeMerphq
        --
        When to use Prototypes?

        I don't believe that the GRT can be sufficiently generalized in any practical sense to cover all problems
        Given a function to convert floats to lexicographically ordered strings, you should be able do a GRT on any series of || linked cmp and <=> operator operators. Are there any antisymmetric, transitive, total relations that can't be written as a series of || linked cmp and <=> operators?
Re: Advanced Sorting - GRT - Guttman Rosler Transform
by dkr (Initiate) on Apr 22, 2003 at 00:54 UTC
    There is a small bug in the example of a Schwartzian Transform.
    $a->[0] <=> $a->[0]
    should be
    $a->[0] <=> $b->[0]
    Granted, you did say 'something like this', 8^)
Re: Advanced Sorting - GRT - Guttman Rosler Transform
by Juerd (Abbot) on Aug 10, 2003 at 10:07 UTC

    Generally speaking the GRT is a signifigant improvement on the ST

    But it doesn't seem te be an improvement with your example sort:

    Benchmark: running bare, grt, st for at least 3 CPU seconds... bare: 6 wallclock secs ( 3.08 usr + 0.01 sys = 3.09 CPU) @ 12 +79584.47/s (n=3953916) grt: 8 wallclock secs ( 3.11 usr + 0.00 sys = 3.11 CPU) @ 11 +18145.66/s (n=3477433) st: 7 wallclock secs ( 3.05 usr + 0.01 sys = 3.06 CPU) @ 11 +46781.05/s (n=3509150) Rate grt st bare grt 1118146/s -- -2% -13% st 1146781/s 3% -- -10% bare 1279584/s 14% 12% --

    Juerd # { site => 'juerd.nl', plp_site => 'plp.juerd.nl', do_not_use => 'spamtrap' }

      Your benchmark is utterly bogus.

      First, you use the q{} form of benchmark and yet you try to access a lexical variable that will not be in scope when the code is evaled. So your code benchmarks sorting an empty list! (The counts in the millions per second range should have tipped you off that something was wrong.)

      Next the 'st' implementation has a bug in it so it doesnt actually sort correctly $a->[0] <=> $a->[0] should read $a->[0] <=> $b->[0], and the 'bare' implementation doesnt do the full sort either as it doesnt sort based on the word as well as the 'E' count.

      Once the benchmark is fixed up to actually test similar things against similar things, and to use a better non empty data set we see that the GRT crushes the competition, and that the ST greatly outperforms the uncached code.

      Benchmark: running bare, grt, st, each for at least 3 CPU seconds... bare: 3 wclk secs ( 3.02 usr + 0.00 sys = 3.02 CPU) @ 4.63/s + (n=14) grt: 4 wclk secs ( 3.15 usr + 0.00 sys = 3.15 CPU) @ 15.54/s + (n=49) st: 3 wclk secs ( 3.13 usr + 0.00 sys = 3.13 CPU) @ 10.21/s + (n=32) Rate bare st grt bare 4.63/s -- -55% -70% st 10.2/s 120% -- -34% grt 15.5/s 235% 52% --

      And using the exact same data set as you did we see that the GRT still wins, even if 'bare' outperforms 'st'.

      Rate st bare grt st 18009/s -- -21% -49% bare 22661/s 26% -- -36% grt 35616/s 98% 57% --

      The moral of the story is a common one: Always test your benchmarks. Always double check that what you are comparing is equivelent. Always be careful with selecting the data set you benchmark against. If you benchmark a code snippet and not a subref then you should excercise even more caution. (Generally I dont think its a good idea actually.)


      ---
      demerphq

      <Elian> And I do take a kind of perverse pleasure in having an OO assembly language...

        So your code benchmarks sorting an empty list!

        Oops.

        Next the 'st' implementation has a bug in it so it doesnt actually sort correctly $a->[0] <=> $a->[0] should read $a->[0] <=> $b->[0]

        Copied that from the root node. Wrong indeed.

        grt 15.5/s 235% 52% --

        That's very comforting. Thanks.

        Juerd # { site => 'juerd.nl', plp_site => 'plp.juerd.nl', do_not_use => 'spamtrap' }

        Be careful which Perl version you are using. I ran the (corrected) benchmark under perl v5.6.1 and got this:
        Rate st grt bare bare 4.55/s -- -51% -66% st 9.32/s 105% -- -31% grt 13.5/s 196% 44% --
        But under perl v5.8.0 I got this:
        Rate st grt bare st 9.43/s -- -15% -37% grt 11.1/s 18% -- -26% bare 15.0/s 59% 35% --
        which seems to negate the use of GRT or ST sorting.
Re: Advanced Sorting - GRT - Guttman Rosler Transform
by eyepopslikeamosquito (Canon) on Aug 25, 2003 at 12:26 UTC

    Your example code:

    my @sorted=map { substr($_,4) } sort map { pack("LA*",tr/eE/eE/,$_) } @words;
    is not quite correct; the "LA*" should be "NA*". The "L" signifies a native 32-bit unsigned integer yet to guarantee sorting correctness on all architectures you must specify "N", network (big-endian) order.

      Ah, good catch. On my architecture it would sort correctly, but not on all. This is one of the reasons this technique can be a bit tricky, but getting it right can have a nice pay off. Cheers.


      ---
      demerphq

      <Elian> And I do take a kind of perverse pleasure in having an OO assembly language...
Re: Advanced Sorting - GRT - Guttman Rosler Transform
by planetscape (Canon) on Sep 15, 2005 at 02:49 UTC

Log In?
Username:
Password:

What's my password?
Create A New User
Node Status?
node history
Node Type: perlmeditation [id://145659]
Approved by root
help
Chatterbox?
and the web crawler heard nothing...

How do I use this? | Other CB clients
Other Users?
Others having an uproarious good time at the Monastery: (7)
As of 2014-08-30 04:18 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    The best computer themed movie is:











    Results (291 votes), past polls