Beefy Boxes and Bandwidth Generously Provided by pair Networks
There's more than one way to do things

Re^7: Perl is dying (better sorting than the ST)

by demerphq (Chancellor)
on Feb 11, 2007 at 12:07 UTC ( #599448=note: print w/replies, xml ) Need Help??

in reply to Re^2: Better sorting than the ST
in thread Perl is dying

Er, maybe im missing something here, how is an ST any different from what you have done in terms of big O. Leaving the sort aside since presumably it is as efficient either way, it appears to me that with an ST you have N steps to create the arrays, and N steps to unpack them, with your code you have N steps to create the key array, and N steps to map the sorted keys back to real values.

It strikes me that your routine might be more efficient because the unit cost of each step will be cheaper, but that seems to me to be a form of optimisation that could be argued should be left to the compiler.

So what am I missing?

Also, if you are going down this route you might as well go as far you can. According to benchmark the true GRT I whipped up for this is about 50% faster than your routine, and uses less memory (by two whole AV's :-).

use strict; use warnings; use Benchmark qw(cmpthese); my @a_of_a = map { [($_) x 5] } qw( a list of words to sort but the list needs to be fairl +y long so we will blab on a bit just because we have to which sorta sucks because it would be more interesting to write code or eat ice-cream ); my @grt; my @ari; cmpthese -2,{ grt => sub { my @grt = map { $a_of_a[unpack"x4N",$_] } sort map { pack "A4N", lc $a_of_a[$_][ 4 ], $_ } 0..$#a_of_a; }, ari => sub { my @ari = do { my @key = map { lc substr $_->[ 4 ], 0, 3 } @a_of_a; my @idx = sort { $key[ $a ] cmp $key[ $b ] } 0 .. $#key; @a_of_a[ @idx ]; }; } }; if (@grt) { for (0..$#a_of_a) { if ( $ari[$_] != $grt[$_] ) { die "$_ is bad\n"; } } } __END__ Rate ari grt ari 3946/s -- -37% grt 6260/s 59% -- Rate ari grt ari 4111/s -- -33% grt 6179/s 50% -- Rate ari grt ari 3989/s -- -38% grt 6393/s 60% -- Rate ari grt ari 4049/s -- -36% grt 6300/s 56% -- Rate ari grt ari 3982/s -- -38% grt 6467/s 62% --

Because the GRT avoids the substr, needs no indirection on the key which in turn means you dont need provide a custom sort function which in turn means the per cost of each comparison is quite a bit lower. AND you get the added bonus that the sort is stable, and a lower memory profile.

But it also arguable that this too is a form of premature optimisation.

Update: when i compare your approach to a ST like the following, i see no performance difference between it and your approach.

st => sub { my @st = map { pop @$_; $_ } sort { $a->[-1] cmp $b->[-1] } map { my $k = lc substr $_->[ 4 ], 0, 3; push @$_,$k; +$_ } @a_of_a; }

Replies are listed 'Best First'.
Re^8: Perl is dying (better sorting than the ST)
by diotalevi (Canon) on Feb 12, 2007 at 01:18 UTC

    Interestingly, when I looked at the optree above you'll notice that the custom sort sub { $a->[0] cmp $b->[0] } got entirely elided. Perhaps someone snuck in an interesting sort of optimization to the sort function where even more complicated things like that are actually performed in native C and not actually done in perl at all.

    ⠤⠤ ⠙⠊⠕⠞⠁⠇⠑⠧⠊

Re^4: Better sorting than the ST
by Aristotle (Chancellor) on Feb 13, 2007 at 04:29 UTC

    it appears to me that with an ST you have N steps to create the arrays, and N steps to unpack them, with your code you have N steps to create the key array, and N steps to map the sorted keys back to real values.

    I was talking about memory, not efficiency. Even if Perl had tuples, ST still requires extra memory for container structures proportional to the size of the list being sorted, whereas index sorting only requires a single container. Although in terms of big-O, thatís irrelevant, since if you consider the keys as well, then the ST requires extra memory on the order of n · (s + a) (where s and a are constants and stand for the overhead of a scalar and array, respectively) whereas index sorting only requires n · s + 1 · a; but these are both O(n).

    My bad.

    Of course, in Perl, thereís a huge difference between the two because n · a easily becomes a considerable quantity.

    However, I donít see how index sorting is a premature optimisation. Itís not just parsimonious with memory, itís also easier to follow (in my opinion) than the ST for someone who has never seen either idiom.

    The real way to go about making sorting faster would of course be to decouple key extraction from the comparator function, which would allow almost all sorts to be written declaratively. There was discussion about this on perl6-language back when I was subscribed, but I donít know if anything came of it.

    Makeshifts last the longest.

Log In?

What's my password?
Create A New User
Node Status?
node history
Node Type: note [id://599448]
and the web crawler heard nothing...

How do I use this? | Other CB clients
Other Users?
Others drinking their drinks and smoking their pipes about the Monastery: (5)
As of 2020-08-10 11:55 GMT
Find Nodes?
    Voting Booth?
    Which rocket would you take to Mars?

    Results (57 votes). Check out past polls.