Beefy Boxes and Bandwidth Generously Provided by pair Networks
laziness, impatience, and hubris
 
PerlMonks  

comment on

( [id://3333]=superdoc: print w/replies, xml ) Need Help??

In this meditation, I'm benchmarking a couple of subroutines to uniformly shuffle an array.

I know others have done this too, for exapmle see Re^2: Randomising the order of an array. However, I've learnt much from doing this myself.

First, let's see what shuffling functions I was using. (I originally posted these at Re^2: Array Shuffle. I've removed the bless method, as it's very slow (probably because it has to numerify two strings in every comparision).)

I used six different algorithms. One of these just calls the shuffle function from List::Util, which is usually implemented as an XS functions, so it can be expected that this would be the fastest.

Two of the algorithms assign a random number to each element in the array, and sort the array using that number as a key. One of these uses a schwartzian transform, the other one uses a single array to store the keys.

Note that these sorting algorithms only give a truly uniform random permutation if the rand() function of perl gives enough random bits. This is because of the birthday-paradox, see chapter 5.3 of the Cormen – Leiserson – Rivest – Stein book for more details. I was using linux, so perl uses the drand48 function from glibc, which gives 48 good random bits, so this doesn't affect me except for large arrays. However, on at least some windows systems, the rand perl function probably calls the C rand function only once, and that gives only 15 random bits which is too few for even small arrays.

The other three algorithms used the classical swapping algorithm: take each element in the array starting from the first, and swap it with one of the elements after it (including itself). I've made three variants of this. One is the classical swap, one uses map to collect the swapped-back data, and one uses shift to remove elements from the front of the array instead. (Using shift was not my idea, someone on the cb recommended it earlier, but I can't remember who he was, sorry.)

On this algorithm, see both the above mentioned chapter in the Cormen book, or chapter 3.4.2 of Knuth's TAOCP.

Here's the code I was using.

use warnings; use strict; use Benchmark ":all"; use IO::Handle; my %shuffle; use List::Util; $shuffle{"xs"} = \&List::Util::shuffle; $shuffle{"swap"} = sub { my @p = @_; my $k; $k = $_ + int(rand(@p - $_)), @p[$_, $k] = @p[$k, $_] for 0 .. @p - 1; @p; }; $shuffle{"map"} = sub { my @p = @_; map { splice @p, $_ + rand(@p - $_), 1, $p[$_] } 0 .. @p - 1; }; $shuffle{"shift"} = sub { my @p = @_; my $t; map { $t = splice @p, rand(@p), 1, $p[0]; shift @p; $t } 0 .. +@p - 1; }; $shuffle{"weight"} = sub { my @w = map { rand } @_; @_[sort { $w[$a] <=> $w[$b] } 0 .. @_ - 1]; }; $shuffle{"schwartzian"} = sub { map { $$_[1] } sort { $$a[0] <=> $$b[0] } map { [rand, $_] } @ +_; }; if (0) { $shuffle{"bless"} = sub { map { $$_[0] } sort { ref $a <=> ref $b } map { bless [$_], ra +nd } @_; }; } if (0) { my $N = 0 + ($ARGV[0] || 20); my $T = 0 + ($ARGV[1] || 1000); for my $n (keys(%shuffle)) { my %f; print $n; my $a = $shuffle{$n}; for (1 .. $T) { my @a = 1 .. $N; $f{join(" ", &$a(@a))}++; } for (sort keys(%f)) { printf "%5d: %s\n", $f{$_}, $_; } print 0+keys(%f), "\n\n"; } } my @sd; my %shtest = map { my($n, $a) = ($_, $shuffle{$_}); $n, sub { my @r = &$a(@sd); } } keys(%shuffle); for my $N (map { 1 + 2 * 3**$_ } 0 .. 8) { print "testing on $N elements\n"; @sd = my @sd0 = map { rand } 1 .. $N; cmpthese(-5, \%shtest); join(";", @sd) eq join(";", @sd0) or die "error: some of the shuffle functions have changed the arr +ay"; flush STDOUT; } print "done\n"; __END__

And here's the output, using v5.8.8 on i686-linux.

testing on 3 elements Rate schwartzian swap weight shift ma +p xs schwartzian 131923/s -- -1% -15% -20% -23 +% -88% swap 133815/s 1% -- -13% -19% -22 +% -88% weight 154592/s 17% 16% -- -6% -10 +% -86% shift 165196/s 25% 23% 7% -- -4 +% -86% map 172130/s 30% 29% 11% 4% - +- -85% xs 1140464/s 764% 752% 638% 590% 563 +% -- testing on 7 elements Rate schwartzian weight swap shift map + xs schwartzian 61200/s -- -15% -20% -27% -29% + -90% weight 72048/s 18% -- -5% -13% -17% + -88% swap 76058/s 24% 6% -- -9% -12% + -87% shift 83283/s 36% 16% 10% -- -4% + -86% map 86718/s 42% 20% 14% 4% -- + -85% xs 592396/s 868% 722% 679% 611% 583% + -- testing on 19 elements Rate schwartzian weight swap shift map + xs schwartzian 21339/s -- -15% -32% -33% -43% + -91% weight 25214/s 18% -- -20% -21% -32% + -90% swap 31452/s 47% 25% -- -2% -16% + -87% shift 31957/s 50% 27% 2% -- -14% + -87% map 37239/s 75% 48% 18% 17% -- + -85% xs 243980/s 1043% 868% 676% 663% 555% + -- testing on 55 elements Rate schwartzian weight swap shift map + xs schwartzian 6790/s -- -17% -41% -42% -49% + -92% weight 8154/s 20% -- -29% -30% -39% + -90% swap 11492/s 69% 41% -- -1% -14% + -86% shift 11650/s 72% 43% 1% -- -13% + -86% map 13400/s 97% 64% 17% 15% -- + -84% xs 83719/s 1133% 927% 628% 619% 525% + -- testing on 163 elements Rate schwartzian weight shift swap map + xs schwartzian 1902/s -- -19% -52% -52% -58% + -93% weight 2356/s 24% -- -41% -41% -48% + -92% shift 3982/s 109% 69% -- -0% -13% + -86% swap 3998/s 110% 70% 0% -- -12% + -86% map 4559/s 140% 93% 14% 14% -- + -84% xs 28869/s 1418% 1125% 625% 622% 533% + -- testing on 487 elements Rate schwartzian weight shift swap map + xs schwartzian 515/s -- -20% -60% -61% -65% + -95% weight 644/s 25% -- -50% -51% -56% + -93% shift 1292/s 151% 101% -- -2% -12% + -86% swap 1321/s 157% 105% 2% -- -10% + -86% map 1467/s 185% 128% 14% 11% -- + -84% xs 9413/s 1728% 1363% 629% 612% 542% + -- testing on 1459 elements Rate schwartzian weight shift swap map + xs schwartzian 121/s -- -28% -69% -70% -71% + -95% weight 168/s 39% -- -57% -58% -59% + -94% shift 396/s 226% 135% -- -1% -4% + -85% swap 401/s 230% 138% 1% -- -3% + -85% map 414/s 241% 146% 4% 3% -- + -84% xs 2656/s 2088% 1477% 570% 563% 542% + -- testing on 4375 elements Rate schwartzian weight map shift swap + xs schwartzian 20.8/s -- -33% -71% -75% -76% + -94% weight 31.1/s 50% -- -57% -63% -64% + -92% map 72.6/s 249% 133% -- -14% -16% + -81% shift 84.7/s 307% 172% 17% -- -2% + -77% swap 86.6/s 317% 178% 19% 2% -- + -77% xs 376/s 1709% 1108% 418% 344% 334% + -- testing on 13123 elements Rate schwartzian weight map shift swap + xs schwartzian 4.46/s -- -28% -72% -73% -76% + -95% weight 6.18/s 39% -- -61% -63% -67% + -92% map 15.7/s 252% 154% -- -6% -16% + -81% shift 16.8/s 277% 172% 7% -- -10% + -80% swap 18.7/s 319% 203% 19% 11% -- + -77% xs 82.0/s 1739% 1227% 422% 388% 339% + -- done

Ok, so what can we learn from all this output?

As we have expected, the XS function was much faster than any other method.

The next thing we can see is that the sorting solutions (schwartzian and weight) were slower than any of the combinatorical ones, and that this disadvantage is even more significant with larger arrays. This is sort of logical, after all, sorting takes O(n·log n) time but the other algorithms take only O(n).

We can also see that the schwartzian method is slower than the one using only one array. This was surprising to me when I first met it, but then tye and Aristotle explained to me in the replies of Re: Benchmark, -s versus schwartzian why this has to be like this. (Here, I don't implement the hash lookup approach shown in that thread, only the array lookup one.)

I also wonder if it's possible to speed up this method by using the variant of the schwartzian transform where you map the data to strings that you can sort with one of the built-in sorting functions, see Re^3: Benchmark, -s versus schwartzian (vs hash) or •Re^2: Benchmark, -s versus schwartzian or fast, flexible, stable sort. (Apparently that's called GRT. Can anyone tell what that stands for? (Update: bart says it's the Guttman-Rosler transform, see also this slide.)) Maybe I'll try that later.

Now let's examine the faster, combinatorical solutions (swap, map, shift).

All three solutions have about the same speed. The (fortran-like) swapping solution is the fastest for larger arrays (from say 4K elements), but the other two are faster for small arrays. This is probably because it doesn't have to do memory managment. Finally, the map solution is consistently faster than the shift solution, but only by a small amount.


In reply to Benchmarking perfect shuffles by ambrus

Title:
Use:  <p> text here (a paragraph) </p>
and:  <code> code here </code>
to format your post; it's "PerlMonks-approved HTML":



  • Are you posting in the right place? Check out Where do I post X? to know for sure.
  • Posts may use any of the Perl Monks Approved HTML tags. Currently these include the following:
    <code> <a> <b> <big> <blockquote> <br /> <dd> <dl> <dt> <em> <font> <h1> <h2> <h3> <h4> <h5> <h6> <hr /> <i> <li> <nbsp> <ol> <p> <small> <strike> <strong> <sub> <sup> <table> <td> <th> <tr> <tt> <u> <ul>
  • Snippets of code should be wrapped in <code> tags not <pre> tags. In fact, <pre> tags should generally be avoided. If they must be used, extreme care should be taken to ensure that their contents do not have long lines (<70 chars), in order to prevent horizontal scrolling (and possible janitor intervention).
  • Want more info? How to link or How to display code and escape characters are good places to start.
Log In?
Username:
Password:

What's my password?
Create A New User
Domain Nodelet?
Chatterbox?
and the web crawler heard nothing...

How do I use this?Last hourOther CB clients
Other Users?
Others making s'mores by the fire in the courtyard of the Monastery: (2)
As of 2024-04-19 19:40 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    No recent polls found