Beefy Boxes and Bandwidth Generously Provided by pair Networks
Perl-Sensitive Sunglasses
 
PerlMonks  

Re^7: RFC:A brief tutorial on Perl's native sorting facilities.

by Jenda (Abbot)
on Feb 06, 2007 at 22:39 UTC ( #598656=note: print w/ replies, xml ) Need Help??


in reply to Re^6: RFC:A brief tutorial on Perl's native sorting facilities.
in thread RFC:A brief tutorial on Perl's native sorting facilities.

No of course I did not expect it to work. I was showing BrowserUK that it doesn't. I feel stupid for not noticing that I could merge the "sort indices" trick and GRT though to get the best from both worlds. The speed of GRT and the generality of the other styles of complex sort. Silly me. Thanks!


Comment on Re^7: RFC:A brief tutorial on Perl's native sorting facilities.
Re^8: RFC:A brief tutorial on Perl's native sorting facilities.
by BrowserUk (Pope) on Feb 07, 2007 at 00:36 UTC

    I was just about to respond in the same way as jdporter, but I thought I'd run a benchmark first. To my surprise, the key sort is faster than the indexed GRT! Assuming my benchmark isn't screwed:

    #! perl -slw use strict; use Date::Calc qw(Decode_Date_US); use Data::Dumper; use Benchmark qw[ cmpthese ]; sub convertdate { return sprintf '%04d%02d%02d', Decode_Date_US( $_[0] ) } our @list = ( { fname => 'Jan', lname => 'Krynicky', birth_date => 'Sep 3 1975', } +, { fname => 'Pavel', lname => 'Krynicky', birth_date => 'Dec 25 1969' +, }, { fname => 'Martin', lname => 'Krynicky', birth_date => 'Aug 24 1973 +', }, map{ { fname => $_, lname => 'Krynicky', birth_date => 'Jan 1 ' . ( 1900 + int( rand 100 ) ) } } 1 .. $ARGV[ 0 ] || 100 ); cmpthese -1, { ST => q[ my @sorted = map{ $_->[1] } sort{ $a->[0] <=> $b->[0] } map{ [ convertdate($_->{birth_date}), $_ ] } @list; ], KEY => q[ { my @keys = map convertdate($_->{birth_date}), @list; @sorted = @list[ sort {$keys[$a] cmp $keys[$b]} (0..$#list +) ]; } ], GRTish => q[ my @sorted = @list[ map { substr( $_, 8 ) } sort map { convertdate( $list[$_]->{birth_date} ) . $_ } 0 .. $#list ]; ], };

    the keyed sort works out between 20 and 35% faster for this application.

    s/iter ST GRTish KEY ST 1.67 -- -4% -24% GRTish 1.61 4% -- -21% KEY 1.27 32% 27% -- C:\test>junk9 1 Rate GRTish ST KEY GRTish 21914/s -- -10% -19% ST 24349/s 11% -- -10% KEY 27113/s 24% 11% -- C:\test>junk9 1e2 Rate GRTish ST KEY GRTish 986/s -- -1% -16% ST 997/s 1% -- -15% KEY 1179/s 20% 18% -- C:\test>junk9 1e3 Rate ST GRTish KEY ST 71.2/s -- -11% -25% GRTish 80.3/s 13% -- -16% KEY 95.1/s 34% 19% -- C:\test>junk9 1e4 Rate ST GRTish KEY ST 6.31/s -- -9% -23% GRTish 6.90/s 9% -- -16% KEY 8.23/s 30% 19% --

    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    "Science is about questioning the status quo. Questioning authority".
    In the absence of evidence, opinion is indistinguishable from prejudice.

      With only three items in the list, you're never going to see the performance advantage of the default sort; all the other effects dominate. Try it with 1_000 or 10_000 items.

      Update: Wups - I totally missed the fact that the number of items in the list is controlled by a run-time parameter. Sorry.

      So I did my own benchmark, and I found that what BrowserUk calls 'GRTish' was faster. Why? One difference between his benchmark and mine was the key calculation routine, convertdate. I didn't have Date::Calc installed, and didn't want to install it just for this test, so I made mine be simply

      sprintf "%20s", $_[0]
      Could that be the source of the difference in results?

      To check this possibility, I made my key calculation more expensive:

      sprintf '%20s', 2.2 / ( 2.2 / ( int( $_[0] * 8.00001 ) >> 3 ) )
      (That's a couple of chained math no-ops).

      Sure enough, KEY performed better than GRTish. And Decode_Date_US is, no doubt, a very costly function.

      Still — Why should the performance of this key calculation routine make such a difference? It's called exactly once per list item in both cases. And why is my expectation that the default sort (i.e. with no comparison block) should trounce other, non-default sorts completely unrealized here? Does sort have some nifty optimizations in 5.8?

Log In?
Username:
Password:

What's my password?
Create A New User
Node Status?
node history
Node Type: note [id://598656]
help
Chatterbox?
and the web crawler heard nothing...

How do I use this? | Other CB clients
Other Users?
Others imbibing at the Monastery: (4)
As of 2014-09-30 21:39 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    How do you remember the number of days in each month?











    Results (384 votes), past polls