Re: A bad shuffle

Replies are listed 'Best First'.
Re^2: A bad shuffle by Anonymous Monk on Mar 21, 2005 at 01:43 UTC
Actually, the original "naive_shuffle" is not a Fisher-Yates shuffle, it is a "naive shuffle" implementation. The OP's analysis is correct. His final algorithm, however, is a correct Fisher-Yates implementation.
Re^3: A bad shuffle by jdporter (Paladin) on Mar 21, 2005 at 03:29 UTC
perlfaq4's "How do I shuffle an array randomly?" not only gives a "canonical" implementation of the Fisher-Yates algorithm in Perl, but also refers to the List::Util module's `shuffle` function, which is an implementation of Fisher-Yates in C. The algorithm is also implemented in Abigail's Algorithm::Numerical::Shuffle, the doco of which gives some citations into the literature (Knuth, Fisher & Yates, etc.). Fisher-Yates has also been discussed many times on clpm. (You can do a Google Groups search. My ego compels me to link to this posting by yrs trly, which isn't about F-Y, but uses it.)
Re^4: A bad shuffle by tlm (Prior) on Mar 22, 2005 at 02:41 UTC
Yes, I was looking at the code for List::Util earlier today. In addition to the C implementation of the Fisher-Yates shuffle, it includes the following "backup" in Perl:

```perl
sub shuffle (@) {
  my @a = \(@_);   # references to the arguments
  my $n;
  my $i = @_;      # number of elements not yet picked
  map {
    $n = rand($i--);                   # pick one of the remaining slots
    (${$a[$n]}, $a[$n] = $a[$i])[0];   # return it, refill the slot from the end
  } @_;
}
```

Pretty gnarleous, IMO. Kind of like a hybrid between F-Y and Tanktalus's shuffler. The C implementation of List::Util::shuffle is 10-20x faster than the Perl implementation. For the practical programmer: end of story.

Still, to satisfy my monkly preoccupation with how many angels can lambada on the head of a pin, and more importantly, in order to reduce this dead horse to a thin protein film, I benchmarked the three shufflers: ta = Tanktalus's shuffler; lu = Perl implementation of List::Util::shuffle; rp = `random_perm`:

```
N = 1000
      Rate   rp   lu   ta
rp   230/s   -- -10% -22%
lu   257/s  12%   -- -13%
ta   296/s  29%  15%   --

N = 10_000
      Rate   ta   lu   rp
ta  14.3/s   -- -27% -34%
lu  19.6/s  37%   -- -10%
rp  21.7/s  52%  11%   --

N = 100_000
       Rate    ta    lu    rp
ta  0.144/s    --  -92%  -93%
lu   1.75/s 1118%    --  -13%
rp   2.01/s 1302%   15%    --
```

the lowliest monk
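For readers more at home in C, the draw-and-refill step of the Perl fallback above can be sketched roughly as follows. This is only an illustrative transliteration, not the C code that ships with List::Util. The names `draw_shuffle` and `scratch` are made up for the example, and the `rand()` scaling is borrowed from the C Fisher-Yates snippet quoted elsewhere in this thread, so it inherits whatever small bias `rand()` has.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/*
 * Hypothetical helper: copy the n elements of src into dst in random order.
 * scratch needs room for n ints; it plays the role of @a in the Perl code.
 */
void draw_shuffle(const int *src, int *dst, int *scratch, size_t n)
{
    assert(src != NULL && dst != NULL && scratch != NULL);

    for (size_t k = 0; k < n; k++)
        scratch[k] = src[k];

    size_t remaining = n;                      /* $i in the Perl code */
    for (size_t k = 0; k < n; k++) {
        /* Pick one of the slots that has not been drawn yet. */
        size_t pick = (size_t)rand() / ((size_t)RAND_MAX / remaining + 1);
        remaining--;
        dst[k] = scratch[pick];                /* emit the drawn element */
        scratch[pick] = scratch[remaining];    /* refill the hole from the end */
    }
}
```

Each pass removes one element from the pool of candidates, so every input element is emitted exactly once. Seed the generator with `srand()` before calling it if different runs should give different orders.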
Re^2: A bad shuffle by bart (Canon) on Mar 21, 2005 at 09:16 UTC
No, you're wrong. It looks like Fisher-Yates, but there's a slight difference, in that all his array items can move again on every loop. His "correct algorithm" is actually Fisher-Yates, where per loop, one item gets moved into its final position.
Re^3: A bad shuffle by Anonymous Monk on Mar 21, 2005 at 15:47 UTC
Sorry Bart, I thought you were replying to me...
Re^4: A bad shuffle by Anonymous Monk on Mar 21, 2005 at 16:12 UTC
No, clearly his remarks were intended for me, not you.
Re^3: A bad shuffle by Anonymous Monk on Mar 21, 2005 at 14:56 UTC
Bart, That's what I said ;-) Please re-read the node you are replying to. I said the "original ... is not a Fisher-Yates..." and "The final is a correct Fisher-Yates..." - which is exactly what you are asserting now.
Re^2: A bad shuffle by tlm (Prior) on Mar 20, 2005 at 23:37 UTC
Thank you for the information. Perhaps I should have made it clearer that my encounter with this algorithm, which motivated the whole write-up, was in research-oriented/scientific code in which it was being used to sample permutations of an array uniformly at random. My contention is that this is a misuse of this algorithm, because it does not sample the space of permutations uniformly. But in light of what you write about the algorithm's standing and pedigree, the title of my meditation is an awful one. Maybe the whole node should be retracted for the sake of not confusing others. Let me know what you think.

Update: I found a version of Fisher-Yates online (linked from here):

```c
#include <stdlib.h>

void shuffle(int *array, size_t n)
{
    if (n > 1) {
        size_t i;
        for (i = 0; i < n - 1; i++) {                            /**/
            size_t j = i + rand() / (RAND_MAX / (n - i) + 1);    /**/
            int t = array[j];
            array[j] = array[i];
            array[i] = t;
        }
    }
}
```

This is equivalent to the algorithm used by my `random_perm`, not the one used by `naive_shuffle`. To get the latter algorithm, the two lines marked with /**/ above would have to be changed to:

```c
for (i = 0; i < n; i++) {
    size_t j = rand() / (RAND_MAX / n + 1);
```

the lowliest monk
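The non-uniformity claim can be checked by brute force for a three-element array. The sketch below enumerates every possible sequence of random choices instead of calling `rand()`, and tallies which permutation each sequence produces. The names `tally`, `enumerate`, and `perm_code` are invented for this illustration; the two variants differ only in the range `j` is drawn from, as in `random_perm` versus `naive_shuffle`.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define N 3        /* array size used for the enumeration */
#define NCODES 27  /* N^N possible base-N codes */

/* Encode an arrangement of {0,1,2} as a base-N number, e.g. {2,0,1} -> 19. */
static int perm_code(const int *a)
{
    int code = 0;
    for (size_t i = 0; i < N; i++)
        code = code * N + a[i];
    return code;
}

/*
 * Follow every possible choice of j at every step and count the results.
 * fisher_yates != 0: j ranges over i..N-1 (one item is fixed per step).
 * fisher_yates == 0: j ranges over 0..N-1 (any item may move again).
 */
static void enumerate(int *a, size_t i, int fisher_yates, int *counts)
{
    if (i == N) {
        counts[perm_code(a)]++;
        return;
    }
    size_t lo = fisher_yates ? i : 0;
    for (size_t j = lo; j < N; j++) {
        int t = a[i]; a[i] = a[j]; a[j] = t;   /* swap */
        enumerate(a, i + 1, fisher_yates, counts);
        t = a[i]; a[i] = a[j]; a[j] = t;       /* undo the swap */
    }
}

/* Fill counts[0..NCODES-1] for the chosen variant, starting from {0,1,2}. */
void tally(int fisher_yates, int *counts)
{
    int a[N] = {0, 1, 2};
    assert(counts != NULL);
    memset(counts, 0, NCODES * sizeof counts[0]);
    enumerate(a, 0, fisher_yates, counts);
}
```

With `tally(1, counts)` there are 3 x 2 x 1 = 6 choice sequences and each of the six permutations is reached exactly once. With `tally(0, counts)` there are 3^3 = 27 equally likely sequences, and since 27 is not a multiple of 6 they cannot be spread evenly: three permutations are reached 4 times and the other three 5 times.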
Re^3: A bad shuffle by Zaxo (Archbishop) on Mar 20, 2005 at 23:55 UTC
No, don't withdraw it. I think the thing to do is try and figure out the difference between a "fair shuffle" and a uniformly distributed selection over permutations. Are the two the same if there are identical cards in the deck? What is the problem that motivated this? Sometimes confusion, like greed, is good. After Compline, Zaxo
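One way to probe the identical-cards question for a tiny case is to enumerate a three-card deck with two indistinguishable cards. The sketch below assumes "fair" means every visible arrangement is equally likely, which is only one possible reading. The names `tally_duplicates`, `walk`, and `arrangement_code` are invented for the illustration, and the choice of j follows the Fisher-Yates pattern (j drawn from i up to the end).

```c
#include <assert.h>
#include <stddef.h>

#define DECK 3  /* three cards, two of them identical */

/* Encode an arrangement of card values (each 0 or 1) as a 3-bit number. */
static int arrangement_code(const int *a)
{
    int code = 0;
    for (size_t i = 0; i < DECK; i++)
        code = code * 2 + a[i];
    return code;
}

/* Follow every Fisher-Yates choice sequence and count the visible result. */
static void walk(int *a, size_t i, int *counts)
{
    if (i == DECK) {
        counts[arrangement_code(a)]++;
        return;
    }
    for (size_t j = i; j < DECK; j++) {
        int t = a[i]; a[i] = a[j]; a[j] = t;   /* swap */
        walk(a, i + 1, counts);
        t = a[i]; a[i] = a[j]; a[j] = t;       /* undo the swap */
    }
}

/* Fill counts[0..7] for the deck {0, 0, 1}. */
void tally_duplicates(int *counts)
{
    int deck[DECK] = {0, 0, 1};
    assert(counts != NULL);
    for (size_t k = 0; k < 8; k++)
        counts[k] = 0;
    walk(deck, 0, counts);
}
```

The six position permutations collapse onto three visible arrangements (001, 010, 100), each reached twice, so in this small case a uniform choice over permutations is also uniform over what an observer can distinguish.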

