Beefy Boxes and Bandwidth Generously Provided by pair Networks
go ahead... be a heretic
 
PerlMonks  

comment on

( [id://3333]=superdoc: print w/replies, xml ) Need Help??

It occured to me, in a fit of dementia, that a numeric sort with subtraction instead of the <=> operator would shave two strokes in Perl Golf. I became curious about the cost of doing that.

I threw together a benchmark and got a surprise.

No, the subtraction method isn't fast: in fact, it more than doubles the time taken in sorting. The surprise is that the spaceship operator is faster than dictionary sort.

Received wisdom in the Perl community is that its string sort is optimum and there is a price to pay for numeric sort. In this benchmark, each type of sort receives the same data, but already cast to float, integer, or string. As much extra processing as possible is outside the timing loop, so that only sort performance is measured, without any representation penalties for one method over another. Spaceship is not penalized by a string-to-float conversion, for instance. The numbers are the same, but each method gets them in predigested form. No measured time is spent on my or other allocation, since the result is stored in a preallocated array, sized to the data it will receive.

The data (Edit - New, from corrected code):

$ perl sortbench.pl Benchmark: timing 1000 iterations of f_alpha, f_owtdi, f_sship, i_alph +a, i_owtdi, i_sship... f_alpha: 57 wallclock secs (51.10 usr + 0.02 sys = 51.12 CPU) @ 19 +.56/s (n=1000) f_owtdi: 92 wallclock secs (83.53 usr + 0.01 sys = 83.54 CPU) @ 11 +.97/s (n=1000) f_sship: 37 wallclock secs (33.55 usr + 0.00 sys = 33.55 CPU) @ 29 +.81/s (n=1000) i_alpha: 41 wallclock secs (36.44 usr + 0.01 sys = 36.45 CPU) @ 27 +.43/s (n=1000) i_owtdi: 91 wallclock secs (82.73 usr + 0.02 sys = 82.75 CPU) @ 12 +.08/s (n=1000) i_sship: 38 wallclock secs (32.81 usr + 0.00 sys = 32.81 CPU) @ 30 +.48/s (n=1000) Rate f_owtdi i_owtdi f_alpha i_alpha f_sship i_sship f_owtdi 12.0/s -- -1% -39% -56% -60% -61% i_owtdi 12.1/s 1% -- -38% -56% -59% -60% f_alpha 19.6/s 63% 62% -- -29% -34% -36% i_alpha 27.4/s 129% 127% 40% -- -8% -10% f_sship 29.8/s 149% 147% 52% 9% -- -2% i_sship 30.5/s 155% 152% 56% 11% 2% --
Each is a sort of 10000 random numbers from -1000 0 to 1000, repeated 1000 times. The array to sort is generated numerically as floats. It is then translated to integers, leading zero fixed point decimal strings, and leading zero integer strings. An array to assign for each type is preallocated.

It appears to me that spaceship's reputed slowness goes away when its data is already numeric, and the arena is levelled by insisting on equivalent data for both forms of sort.

Here is the benchmark code (Edit - Corrected to avoid negative range and to word align strings):

#!/usr/bin/perl -w use strict; use Benchmark 'cmpthese'; my $count = 1000; my @nfl = map { int( rand(1000) ) } 1..10000; my @nin = map { int } @nfl; my @ain = map { sprintf "%03d", $_ } @nin; my @afl = map { my $str = sprintf "%3.12d", $_; '0' x (3 - index $str, '.') . $str; } @nfl; my @rin = (int 1) x 10000; my @rfl = (1.01) x 10000; my @sfl = ('z' x 16) x 10000; my @sin = ('z' x 4) x 10000; cmpthese( $count, { f_owtdi => sub { @rfl = sort { $a - $b } @nfl }, i_owtdi => sub { @rin = sort { $a - $b } @nin }, f_sship => sub { @rfl = sort { $a <=> $b } @nfl }, i_sship => sub { @rin = sort { $a <=> $b } @nin }, f_alpha => sub { @sfl = sort @afl }, i_alpha => sub { @sin = sort @ain }, });

Update: Code and data corrections, thanks to bart for the spot.

After Compline,
Zaxo

Edit by tye to change PRE to CODE around wide lines


In reply to The High Price of Golf, and A Surprise by Zaxo

Title:
Use:  <p> text here (a paragraph) </p>
and:  <code> code here </code>
to format your post; it's "PerlMonks-approved HTML":



  • Are you posting in the right place? Check out Where do I post X? to know for sure.
  • Posts may use any of the Perl Monks Approved HTML tags. Currently these include the following:
    <code> <a> <b> <big> <blockquote> <br /> <dd> <dl> <dt> <em> <font> <h1> <h2> <h3> <h4> <h5> <h6> <hr /> <i> <li> <nbsp> <ol> <p> <small> <strike> <strong> <sub> <sup> <table> <td> <th> <tr> <tt> <u> <ul>
  • Snippets of code should be wrapped in <code> tags not <pre> tags. In fact, <pre> tags should generally be avoided. If they must be used, extreme care should be taken to ensure that their contents do not have long lines (<70 chars), in order to prevent horizontal scrolling (and possible janitor intervention).
  • Want more info? How to link or How to display code and escape characters are good places to start.
Log In?
Username:
Password:

What's my password?
Create A New User
Domain Nodelet?
Chatterbox?
and the web crawler heard nothing...

How do I use this?Last hourOther CB clients
Other Users?
Others exploiting the Monastery: (4)
As of 2024-04-19 23:56 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    No recent polls found