Beefy Boxes and Bandwidth Generously Provided by pair Networks
Keep It Simple, Stupid
 
PerlMonks  

The High Price of Golf, and A Surprise

by Zaxo (Archbishop)
on Sep 07, 2002 at 08:28 UTC ( #195857=perlmeditation: print w/ replies, xml ) Need Help??

It occured to me, in a fit of dementia, that a numeric sort with subtraction instead of the <=> operator would shave two strokes in Perl Golf. I became curious about the cost of doing that.

I threw together a benchmark and got a surprise.

No, the subtraction method isn't fast: in fact, it more than doubles the time taken in sorting. The surprise is that the spaceship operator is faster than dictionary sort.

Received wisdom in the Perl community is that its string sort is optimum and there is a price to pay for numeric sort. In this benchmark, each type of sort receives the same data, but already cast to float, integer, or string. As much extra processing as possible is outside the timing loop, so that only sort performance is measured, without any representation penalties for one method over another. Spaceship is not penalized by a string-to-float conversion, for instance. The numbers are the same, but each method gets them in predigested form. No measured time is spent on my or other allocation, since the result is stored in a preallocated array, sized to the data it will receive.

The data (Edit - New, from corrected code):

$ perl sortbench.pl Benchmark: timing 1000 iterations of f_alpha, f_owtdi, f_sship, i_alph +a, i_owtdi, i_sship... f_alpha: 57 wallclock secs (51.10 usr + 0.02 sys = 51.12 CPU) @ 19 +.56/s (n=1000) f_owtdi: 92 wallclock secs (83.53 usr + 0.01 sys = 83.54 CPU) @ 11 +.97/s (n=1000) f_sship: 37 wallclock secs (33.55 usr + 0.00 sys = 33.55 CPU) @ 29 +.81/s (n=1000) i_alpha: 41 wallclock secs (36.44 usr + 0.01 sys = 36.45 CPU) @ 27 +.43/s (n=1000) i_owtdi: 91 wallclock secs (82.73 usr + 0.02 sys = 82.75 CPU) @ 12 +.08/s (n=1000) i_sship: 38 wallclock secs (32.81 usr + 0.00 sys = 32.81 CPU) @ 30 +.48/s (n=1000) Rate f_owtdi i_owtdi f_alpha i_alpha f_sship i_sship f_owtdi 12.0/s -- -1% -39% -56% -60% -61% i_owtdi 12.1/s 1% -- -38% -56% -59% -60% f_alpha 19.6/s 63% 62% -- -29% -34% -36% i_alpha 27.4/s 129% 127% 40% -- -8% -10% f_sship 29.8/s 149% 147% 52% 9% -- -2% i_sship 30.5/s 155% 152% 56% 11% 2% --
Each is a sort of 10000 random numbers from -1000 0 to 1000, repeated 1000 times. The array to sort is generated numerically as floats. It is then translated to integers, leading zero fixed point decimal strings, and leading zero integer strings. An array to assign for each type is preallocated.

It appears to me that spaceship's reputed slowness goes away when its data is already numeric, and the arena is levelled by insisting on equivalent data for both forms of sort.

Here is the benchmark code (Edit - Corrected to avoid negative range and to word align strings):

#!/usr/bin/perl -w use strict; use Benchmark 'cmpthese'; my $count = 1000; my @nfl = map { int( rand(1000) ) } 1..10000; my @nin = map { int } @nfl; my @ain = map { sprintf "%03d", $_ } @nin; my @afl = map { my $str = sprintf "%3.12d", $_; '0' x (3 - index $str, '.') . $str; } @nfl; my @rin = (int 1) x 10000; my @rfl = (1.01) x 10000; my @sfl = ('z' x 16) x 10000; my @sin = ('z' x 4) x 10000; cmpthese( $count, { f_owtdi => sub { @rfl = sort { $a - $b } @nfl }, i_owtdi => sub { @rin = sort { $a - $b } @nin }, f_sship => sub { @rfl = sort { $a <=> $b } @nfl }, i_sship => sub { @rin = sort { $a <=> $b } @nin }, f_alpha => sub { @sfl = sort @afl }, i_alpha => sub { @sin = sort @ain }, });

Update: Code and data corrections, thanks to bart for the spot.

After Compline,
Zaxo

Edit by tye to change PRE to CODE around wide lines

Comment on The High Price of Golf, and A Surprise
Select or Download Code
Re: The High Price of Golf, and A Surprise
by bart (Canon) on Sep 07, 2002 at 09:09 UTC
    You should include cmp in your benchmark. I think you'll find it slower than the built-in sort. That speed difference is the overhead of the callback sub.

    I expect that cmp in general is a far slower operation than <=>. The latter only takes one CPU instruction (plus the overhead of the interpreter), the former is a slow library call for many strings — especially if the strings you compare are identical, because now all characters in the strings have to be examined.

    And I think what you found is that the callback overhead for sort is still faster than the speed difference between these two ops.

    Enough blah blah. I've added them to your benchmark code, so it now looks like:

    cmpthese( $count, { f_owtdi => sub { @rfl = sort { $a - $b } @nfl }, i_owtdi => sub { @rin = sort { $a - $b } @nin }, f_sship => sub { @rfl = sort { $a <=> $b } @nfl }, i_sship => sub { @rin = sort { $a <=> $b } @nin }, f_alpha => sub { @sfl = sort @afl }, i_alpha => sub { @sin = sort @ain }, f_cmp => sub { @sfl = sort { $a cmp $b } @afl }, i_cmp => sub { @sin = sort { $a cmp $b } @ain }, });

    I'll update with the results in a moment. This benchmark takes many minutes to run, and I don't want to skew the results by doing heavy stuff with my humble PC.

    Update: I'm back. Here are the results:

    Benchmark: timing 300 iterations of f_alpha, f_cmp, f_owtdi, f_sship, +i_alpha, i_cmp, i_owtdi, i_sship... f_alpha: 26 wallclock secs (25.92 usr + 0.00 sys = 25.92 CPU) @ 11 +.57/s (n=300) f_cmp: 26 wallclock secs (25.87 usr + 0.00 sys = 25.87 CPU) @ 11 +.60/s (n=300) f_owtdi: 37 wallclock secs (36.58 usr + 0.00 sys = 36.58 CPU) @ 8 +.20/s (n=300) f_sship: 17 wallclock secs (16.92 usr + 0.00 sys = 16.92 CPU) @ 17 +.73/s (n=300) i_alpha: 21 wallclock secs (20.76 usr + 0.00 sys = 20.76 CPU) @ 14 +.45/s (n=300) i_cmp: 20 wallclock secs (20.82 usr + 0.00 sys = 20.82 CPU) @ 14 +.41/s (n=300) i_owtdi: 35 wallclock secs (35.53 usr + 0.00 sys = 35.53 CPU) @ 8 +.44/s (n=300) i_sship: 16 wallclock secs (15.87 usr + 0.00 sys = 15.87 CPU) @ 18 +.90/s (n=300) Rate f_owtdi i_owtdi f_alpha f_cmp i_cmp i_alpha f_sship + i_sship f_owtdi 8.20/s -- -3% -29% -29% -43% -43% -54% + -57% i_owtdi 8.44/s 3% -- -27% -27% -41% -42% -52% + -55% f_alpha 11.6/s 41% 37% -- -0% -20% -20% -35% + -39% f_cmp 11.6/s 41% 37% 0% -- -20% -20% -35% + -39% i_cmp 14.4/s 76% 71% 24% 24% -- -0% -19% + -24% i_alpha 14.5/s 76% 71% 25% 25% 0% -- -18% + -24% f_sship 17.7/s 116% 110% 53% 53% 23% 23% -- + -6% i_sship 18.9/s 130% 124% 63% 63% 31% 31% 7% + --
    That's odd. there is NO speed difference between [fi]_cmp and [fi]_alpha. That means that the cost of the callback is negligable... unless Perl is optimizing the callback away?

    Just to make sure, I've also swapped $a and $b in my callback sub. It doesn't make a difference.

    Conclusion: the speed gain you get by using numerical sort, is entirely due to the speed difference between the ops <=> and cmp.

      Interesting:
      $ perl -MO=Deparse,x7 -e'sort { $a cmp $b } @x' sort @x; -e syntax OK $ perl -MO=Deparse,x7 -e'sort { $b cmp $a } @x' sort @x; -e syntax OK $ perl -MO=Deparse,x7 -e'sort { $a <=> $b } @x' sort @x; -e syntax OK $ perl -MO=Deparse,x7 -e'sort { $b <=> $a } @x' sort @x; -e syntax OK
      So far it looks as though B::Deparse is broken.
      $ perl -MO=Deparse,x7 -e'sort { $a->[0] cmp $b->[0] } @x' sort {$$a[0] cmp $$b[0];} @x; -e syntax OK
      Obviously, it is not. Looks rather like Perl has builtins for all the common cases, then.

      Makeshifts last the longest.

        That's very interesting. I had noticed the same thing. Your conclusion might very well be correct, and I hadn't even thought of that possibility.

        Could the next be a special common case, too?

        perl -MO=Deparse,x7 -e'sort { $b cmp $a } @x' sort @x; -e syntax OK
        Well well well, it looks like it. It would explain why there appears to be no overhead for the callback. It can also explain the huge speed difference between "<=>" and "-": only the latter actually uses the callback. Otherwise, it would make no sense: "-" is a very fast operator, and checking the sign of the result should be just as fast. So it must be the overhead of the callback.

        Nevertheless, B::Deparse is broken, because the code it produces is obviously not equivalent to the original source code.

        Small note which makes no difference to the deparsed syntax is that sort() used in a void context is a no-op (sp?). So all those sort()s could be deparsed down to
        $ perl -MO=Deparse,x7 -e'sort { $b <=> $a } @x' -e syntax OK

        _________
        broquaint

        perldoc perldelta, for 5.6.1 says the following about sort:

        Simple sort() using { $a <=> $b } and the like are optimized

        Many common sort() operations using a simple inlined block are now optimized for faster performance.

        Mind you, I was under the impression that this optimisation occurred earlier, sometime back around the 5.004 to 5.005 timeframe. I'll have to go and look on some of my older machines.
        print@_{sort keys %_},$/if%_=split//,'= & *a?b:e\f/h^h!j+n,o@o;r$s-t%t#u'

Re: The High Price of Golf, and A Surprise
by ChemBoy (Priest) on Sep 07, 2002 at 23:55 UTC

    My memory of this issue was that the unmodified sort was optimized in older versions of perl, but that recent editions had optimized all four of the common sorts (numeric and standard, forward and backwards). Fortunately (in some sense) I'm equipped to test this hypothesis, having some older installs to work with. I used bart's benchmarking code, for completeness.

    It would seem, if I'm reading this right, that the optimization of the basic sort in older versions is just barely enough to make it faster than numeric sort: now that the numeric sort is optimized as well, its inherent speed advantage shows. In 5.6.1, all of the sorts are faster, but the spaceship and cmp sorts improve by much more than the others, because the callback overhead referred to elswhere in this thread is removed.



    If God had meant us to fly, he would *never* have given us the railroads.
        --Michael Flanders

Log In?
Username:
Password:

What's my password?
Create A New User
Node Status?
node history
Node Type: perlmeditation [id://195857]
Approved by Corion
Front-paged by RMGir
help
Chatterbox?
and the web crawler heard nothing...

How do I use this? | Other CB clients
Other Users?
Others perusing the Monastery: (5)
As of 2014-12-28 11:00 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    Is guessing a good strategy for surviving in the IT business?





    Results (180 votes), past polls