Beefy Boxes and Bandwidth Generously Provided by pair Networks
Just another Perl shrine
 
PerlMonks  

Re: The High Price of Golf, and A Surprise

by bart (Canon)
on Sep 07, 2002 at 09:09 UTC ( #195859=note: print w/ replies, xml ) Need Help??


in reply to The High Price of Golf, and A Surprise

You should include cmp in your benchmark. I think you'll find it slower than the built-in sort. That speed difference is the overhead of the callback sub.

I expect that cmp in general is a far slower operation than <=>. The latter only takes one CPU instruction (plus the overhead of the interpreter), the former is a slow library call for many strings — especially if the strings you compare are identical, because now all characters in the strings have to be examined.

And I think what you found is that the callback overhead for sort is still faster than the speed difference between these two ops.

Enough blah blah. I've added them to your benchmark code, so it now looks like:

cmpthese( $count, { f_owtdi => sub { @rfl = sort { $a - $b } @nfl }, i_owtdi => sub { @rin = sort { $a - $b } @nin }, f_sship => sub { @rfl = sort { $a <=> $b } @nfl }, i_sship => sub { @rin = sort { $a <=> $b } @nin }, f_alpha => sub { @sfl = sort @afl }, i_alpha => sub { @sin = sort @ain }, f_cmp => sub { @sfl = sort { $a cmp $b } @afl }, i_cmp => sub { @sin = sort { $a cmp $b } @ain }, });

I'll update with the results in a moment. This benchmark takes many minutes to run, and I don't want to skew the results by doing heavy stuff with my humble PC.

Update: I'm back. Here are the results:

Benchmark: timing 300 iterations of f_alpha, f_cmp, f_owtdi, f_sship, +i_alpha, i_cmp, i_owtdi, i_sship... f_alpha: 26 wallclock secs (25.92 usr + 0.00 sys = 25.92 CPU) @ 11 +.57/s (n=300) f_cmp: 26 wallclock secs (25.87 usr + 0.00 sys = 25.87 CPU) @ 11 +.60/s (n=300) f_owtdi: 37 wallclock secs (36.58 usr + 0.00 sys = 36.58 CPU) @ 8 +.20/s (n=300) f_sship: 17 wallclock secs (16.92 usr + 0.00 sys = 16.92 CPU) @ 17 +.73/s (n=300) i_alpha: 21 wallclock secs (20.76 usr + 0.00 sys = 20.76 CPU) @ 14 +.45/s (n=300) i_cmp: 20 wallclock secs (20.82 usr + 0.00 sys = 20.82 CPU) @ 14 +.41/s (n=300) i_owtdi: 35 wallclock secs (35.53 usr + 0.00 sys = 35.53 CPU) @ 8 +.44/s (n=300) i_sship: 16 wallclock secs (15.87 usr + 0.00 sys = 15.87 CPU) @ 18 +.90/s (n=300) Rate f_owtdi i_owtdi f_alpha f_cmp i_cmp i_alpha f_sship + i_sship f_owtdi 8.20/s -- -3% -29% -29% -43% -43% -54% + -57% i_owtdi 8.44/s 3% -- -27% -27% -41% -42% -52% + -55% f_alpha 11.6/s 41% 37% -- -0% -20% -20% -35% + -39% f_cmp 11.6/s 41% 37% 0% -- -20% -20% -35% + -39% i_cmp 14.4/s 76% 71% 24% 24% -- -0% -19% + -24% i_alpha 14.5/s 76% 71% 25% 25% 0% -- -18% + -24% f_sship 17.7/s 116% 110% 53% 53% 23% 23% -- + -6% i_sship 18.9/s 130% 124% 63% 63% 31% 31% 7% + --
That's odd. there is NO speed difference between [fi]_cmp and [fi]_alpha. That means that the cost of the callback is negligable... unless Perl is optimizing the callback away?

Just to make sure, I've also swapped $a and $b in my callback sub. It doesn't make a difference.

Conclusion: the speed gain you get by using numerical sort, is entirely due to the speed difference between the ops <=> and cmp.


Comment on Re: The High Price of Golf, and A Surprise
Select or Download Code
Re^2: The High Price of Golf, and A Surprise
by Aristotle (Chancellor) on Sep 07, 2002 at 11:25 UTC
    Interesting:
    $ perl -MO=Deparse,x7 -e'sort { $a cmp $b } @x' sort @x; -e syntax OK $ perl -MO=Deparse,x7 -e'sort { $b cmp $a } @x' sort @x; -e syntax OK $ perl -MO=Deparse,x7 -e'sort { $a <=> $b } @x' sort @x; -e syntax OK $ perl -MO=Deparse,x7 -e'sort { $b <=> $a } @x' sort @x; -e syntax OK
    So far it looks as though B::Deparse is broken.
    $ perl -MO=Deparse,x7 -e'sort { $a->[0] cmp $b->[0] } @x' sort {$$a[0] cmp $$b[0];} @x; -e syntax OK
    Obviously, it is not. Looks rather like Perl has builtins for all the common cases, then.

    Makeshifts last the longest.

      That's very interesting. I had noticed the same thing. Your conclusion might very well be correct, and I hadn't even thought of that possibility.

      Could the next be a special common case, too?

      perl -MO=Deparse,x7 -e'sort { $b cmp $a } @x' sort @x; -e syntax OK
      Well well well, it looks like it. It would explain why there appears to be no overhead for the callback. It can also explain the huge speed difference between "<=>" and "-": only the latter actually uses the callback. Otherwise, it would make no sense: "-" is a very fast operator, and checking the sign of the result should be just as fast. So it must be the overhead of the callback.

      Nevertheless, B::Deparse is broken, because the code it produces is obviously not equivalent to the original source code.

        That would be broken by necessity rather than in implementation. There simply is no way to represent a numerical sort without putting a callback in the source, but the compiled form of that sort doesn't use any callback. The choice is to represent either the original or the compiled form more closely. I prefer the current behaviour of emphasizing the compiled form since that's what the module is for, after all. Maybe that should be, if implementable, a user selectable option.

        Good catch about the reason of the - vs <=> slowdown. I didn't make that (so very obvious, in retrospect) connection.

        Makeshifts last the longest.

        Just for the record: firstly, it used to be that even simplistic sort subs such as { $a <=> $b } would call out to perl code, but that was changed to look for certain common cases and substitute a C implementation instead; I think that change first appeared in 5.6.1, judging by comparative timings here.

        Secondly, the bug in B::Deparse (which occurred simply because it wasn't immediately updated for the above change) was fixed in perl-5.8.0.

        Just to give a feel for the difference it makes spotting and substituting a C-based routine for a perl-based one, I added an extra variant of i_sship tweaked to avoid the optimization:

        i_sshipz => sub { @rin = sort { $a + 0 <=> $b } @nin },
        and here is the table I got (running under 5.8.0):
        Rate i_sshipz f_owtdi i_owtdi f_alpha i_alpha f_sship + i_sship i_sshipz 16.5/s -- -27% -28% -43% -52% -57% + -58% f_owtdi 22.6/s 37% -- -1% -22% -34% -41% + -42% i_owtdi 22.8/s 38% 1% -- -21% -33% -41% + -41% f_alpha 29.0/s 76% 28% 27% -- -15% -25% + -26% i_alpha 34.2/s 107% 51% 50% 18% -- -11% + -12% f_sship 38.5/s 134% 70% 69% 33% 13% -- + -1% i_sship 38.9/s 136% 72% 71% 34% 14% 1% + --

        Hugo
      Small note which makes no difference to the deparsed syntax is that sort() used in a void context is a no-op (sp?). So all those sort()s could be deparsed down to
      $ perl -MO=Deparse,x7 -e'sort { $b <=> $a } @x' -e syntax OK

      _________
      broquaint

        The output would have been empty then; it is apparently not considered a NOP either:
        $ perl -MO=Deparse,x7 -e'1;' '???'; -e syntax OK
        ('???' is what B::Deparse produces when something gets optimized away.) I'm guessing that's because a sort may have side effects f.ex with tied arrays or inside the callback (if there is one), or something to that effect.

        Makeshifts last the longest.

      perldoc perldelta, for 5.6.1 says the following about sort:

      Simple sort() using { $a <=> $b } and the like are optimized

      Many common sort() operations using a simple inlined block are now optimized for faster performance.

      Mind you, I was under the impression that this optimisation occurred earlier, sometime back around the 5.004 to 5.005 timeframe. I'll have to go and look on some of my older machines.
      print@_{sort keys %_},$/if%_=split//,'= & *a?b:e\f/h^h!j+n,o@o;r$s-t%t#u'

Log In?
Username:
Password:

What's my password?
Create A New User
Node Status?
node history
Node Type: note [id://195859]
help
Chatterbox?
and the web crawler heard nothing...

How do I use this? | Other CB clients
Other Users?
Others exploiting the Monastery: (5)
As of 2014-07-26 06:08 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    My favorite superfluous repetitious redundant duplicative phrase is:









    Results (175 votes), past polls