Beefy Boxes and Bandwidth Generously Provided by pair Networks Bob
Perl Monk, Perl Meditation
 
PerlMonks  

Re^4: Benchmark, -s versus schwartzian

by merlyn (Sage)
on Aug 23, 2004 at 15:48 UTC ( #385120=note: print w/ replies, xml ) Need Help??


in reply to Re^3: Benchmark, -s versus schwartzian
in thread Benchmark, -s versus schwartzian

By "general", I mean "mortal people can work out a solution for all complex sort criterion". You have to be a frickin' genius to work out a single string for the GRT sometimes. The ST in comparison is always straightforward: construct a list of your sortable keys for a given input item.

-- Randal L. Schwartz, Perl hacker
Be sure to read my standard disclaimer if this is a reply.


Comment on Re^4: Benchmark, -s versus schwartzian
Re^5: Benchmark, -s versus schwartzian
by ambrus (Abbot) on Aug 23, 2004 at 19:20 UTC

    Why? You can always do the GRT by using only the key (by which you sort) and an index in the string. I think of something like:

    @files = glob("/bin/*"); @sorted = @files[ map { /(\d+)$/g } sort map { sprintf "%012d %d", -s $files[$_], $_ } 0..@files-1]; print "@sorted$/";

    This is of course efficent only because of Perl optimizing the default sort subroutine. Someone who understands perl source well might be able to change sort so that it would optimize sort methods like {$h[$a]<=>$h[$b]} which would give us a possibility to sort even faster like

    my @h = map -s, @files; @results = @files[sort {$h[$a]<=>$h[$b]} 0..@h-1];
      You can always do the GRT by using only the key...
      Uh, that's sometimes the hard part. Suppose you wanted to sort by a floating point number (say, age in days), then descending by a string of varying length that might have a NUL in it, and then descending by another integer (byte size). Quick, construct the GRT for that. Not easy, huh. Trivial in the ST.

      Yes, you can always get a single GRT string for a multilevel sort. But sometimes, as I said, it takes a frickin' genius.

      -- Randal L. Schwartz, Perl hacker
      Be sure to read my standard disclaimer if this is a reply.

        If sort {$a<=>$b} is indeed optimized (it seems so), you can just use it for a floating-point key, you just have to 'no warnings "numeric"; and put the float in the first place.

        You are, however, right in that if the key is more complicated, this method is not applicable. (You can however gain with the ST only if calculating the key is much slower than comparing them.)

        You know, all this fancy fast sorting would be much, much easier if perl's sort function compared references to arrays the way python's sort function does: that is, lexographically. (aka "dictionary order") Then, the standard Schwartzian transform could omit the { $a->[0] cmp $b->[0] or $a->[1] cmp $b->[1] } bit, and the performance differences between the ST and GRT would almost vanish. This would allow us to stop trying to construct a GRT for squeezing that extra 2% out of a long sort time and get on with our lives.

        Actually, is there a good reason why the builtin sort compare doesn't compare references to arrays in this fashion?

        -- @/=map{[/./g]}qw/.h_nJ Xapou cets krht ele_ r_ra/; map{y/X_/\n /;print}map{pop@$_}@/for@/

      I rarely find the GRT worth the bother. It is ridiculously hard to maintain GRT code, and it's not unlikely for the transformation step in a GRT to be so computionally complex that the order-by-sorted-indices method as per your second snippet beats it anyway — as seen at Re: To use a module...or not.. Order-by-sorted-indices is generally straightforward, easy to maintain, and easy to get consistent results from; yet very fast and memory efficient. I doubt I'll ever need anything else (though it's not completely impossible, of course).

      I appreciate the ST as an excercise in functional thinking, but it's been a long time since I last used it in practice.

      Makeshifts last the longest.

Log In?
Username:
Password:

What's my password?
Create A New User
Node Status?
node history
Node Type: note [id://385120]
help
Chatterbox?
and the web crawler heard nothing...

How do I use this? | Other CB clients
Other Users?
Others romping around the Monastery: (5)
As of 2014-04-20 14:31 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    April first is:







    Results (485 votes), past polls