Benchmarking perfect shufflesby ambrus (Abbot)
|on Feb 28, 2006 at 15:51 UTC||Need Help??|
In this meditation, I'm benchmarking a couple of subroutines to uniformly shuffle an array.
I know others have done this too, for exapmle see Re^2: Randomising the order of an array. However, I've learnt much from doing this myself.
First, let's see what shuffling functions I was using. (I originally posted these at Re^2: Array Shuffle. I've removed the bless method, as it's very slow (probably because it has to numerify two strings in every comparision).)
I used six different algorithms. One of these just calls the shuffle function from List::Util, which is usually implemented as an XS functions, so it can be expected that this would be the fastest.
Two of the algorithms assign a random number to each element in the array, and sort the array using that number as a key. One of these uses a schwartzian transform, the other one uses a single array to store the keys.
Note that these sorting algorithms only give a truly uniform random permutation if the rand() function of perl gives enough random bits. This is because of the birthday-paradox, see chapter 5.3 of the Cormen – Leiserson – Rivest – Stein book for more details. I was using linux, so perl uses the drand48 function from glibc, which gives 48 good random bits, so this doesn't affect me except for large arrays. However, on at least some windows systems, the rand perl function probably calls the C rand function only once, and that gives only 15 random bits which is too few for even small arrays.
The other three algorithms used the classical swapping algorithm: take each element in the array starting from the first, and swap it with one of the elements after it (including itself). I've made three variants of this. One is the classical swap, one uses map to collect the swapped-back data, and one uses shift to remove elements from the front of the array instead. (Using shift was not my idea, someone on the cb recommended it earlier, but I can't remember who he was, sorry.)
On this algorithm, see both the above mentioned chapter in the Cormen book, or chapter 3.4.2 of Knuth's TAOCP.
Here's the code I was using.
And here's the output, using v5.8.8 on i686-linux.
Ok, so what can we learn from all this output?
As we have expected, the XS function was much faster than any other method.
The next thing we can see is that the sorting solutions (schwartzian and weight) were slower than any of the combinatorical ones, and that this disadvantage is even more significant with larger arrays. This is sort of logical, after all, sorting takes O(n·log n) time but the other algorithms take only O(n).
We can also see that the schwartzian method is slower than the one using only one array. This was surprising to me when I first met it, but then tye and Aristotle explained to me in the replies of Re: Benchmark, -s versus schwartzian why this has to be like this. (Here, I don't implement the hash lookup approach shown in that thread, only the array lookup one.)
I also wonder if it's possible to speed up this method by using the variant of the schwartzian transform where you map the data to strings that you can sort with one of the built-in sorting functions, see Re^3: Benchmark, -s versus schwartzian (vs hash) or •Re^2: Benchmark, -s versus schwartzian or fast, flexible, stable sort. (Apparently that's called GRT. Can anyone tell what that stands for? (Update: bart says it's the Guttman-Rosler transform, see also this slide.)) Maybe I'll try that later.
Now let's examine the faster, combinatorical solutions (swap, map, shift).
All three solutions have about the same speed. The (fortran-like) swapping solution is the fastest for larger arrays (from say 4K elements), but the other two are faster for small arrays. This is probably because it doesn't have to do memory managment. Finally, the map solution is consistently faster than the shift solution, but only by a small amount.