in reply to Re: FisherYates theory in thread FisherYates theory
UPDATE: As pointed out by others, I made an error when translating the code. See their summaries for good explanations. Cheers, and thanks everyone. (tail between legs)
In a previous article, I was challenged for doubting the effectiveness of the FisherYates shuffle as described in perlfaq.
Below, I have written code that exhausts all possible random sequences that could be used during a particular FisherYates shuffle. Statistically, this should be valid, as before the shuffle begins, there is an equal chance that the random sequence generated could be 0 0 0 0 0 as 0 1 2 3 4 as 4 4 4 4 4. By exhaustively executing the FisherYates shuffle, and calculating the total number of occurrences that each result set is produced, we can determine whether the FisherYates shuffle has the side effect of weighting the results, or whether the shuffle is truly random, in that it should be approximately well spread out.
my $depth = 5;
my %results;
sub recurse
{
if (@_ >= $depth) {
my @deck = (1 .. $depth);
shuffle(\@deck, [@_]);
$results{join('', @deck)}++;
} else {
recurse(@_, $_) for 0 .. ($depth1);
}
}
sub shuffle
{
my($deck, $rand) = @_;
my $i = @$deck;
while ($i) {
my $j = shift @$rand;
@$deck[$i,$j] = @$deck[$j,$i];
}
}
recurse;
for (sort {$results{$b} <=> $results{$a}} keys %results) {
printf "%10d %s\n", $results{$_}, $_;
}
With the above code, I was able to determine that with a deck size of 5, and an initial set of 1 2 3 4 5, there is three times the probability that the resulting set will be 3 1 2 5 4 than the probability that the resulting set will be 2 3 4 5 1. To me, this indicates that this theory is flawed.
If anybody needs to prove to themselves that the test is exhaustive, print out "@$rand" in the shuffle subroutine.
Please analyze the code carefully, pull out your school books, and see if I have made a mistake.
Cheers,
mark
Re: FisherYates theory... does this prove that it is invalid? by jsprat (Curate) on Jul 25, 2003 at 01:51 UTC 
Hi MarkM,
That algorithm shows the possible results of a biased shuffle, not a FisherYates shuffle. The random sequence generated would not be 00000 to 44444, it would be 0000 to 4321 (a five digit shuffle requires 4 iterations  the faq goes 5, but the last never swaps  with each iteration shuffling one less item). The while loop in shuffle needs one less iteration, and a minor adjustment to recurse would look like this:
#!/usr/bin/perl
use strict;
use warnings;
my $depth = 4;
my %results;
sub recurse
{
if (@_ == $depth) {
shift; #discard $num
my @deck = (1 .. $depth);
shuffle(\@deck, [@_]);
$results{join('', @deck)}++;
} else {
my $num = shift  $depth  1;
# one less element each iteration
recurse($num, @_, $_) for 0 .. $num;
}
}
sub shuffle
{
my($deck, $rand) = @_;
my $i = @$deck;
# uncomment the following line
# print "@$rand\n";
# predecrement $i instead of post  the last would be a noop in
+this case
while ($i) {
my $j = shift @$rand;
@$deck[$i,$j] = @$deck[$j,$i];
}
}
recurse;
for (sort {$results{$b} <=> $results{$a}} keys %results) {
printf "%10d %s\n", $results{$_}, $_;
}
Here are the results of the modifications, using 4 elements instead of 5 (only 24 possible permutations instead of 120  makes the node much more readable ;):
Each possible permutation is shown exactly one time, for a possibility of being selected 1 out of 24 times (assuming a perfect rng). Makes sense??? Update: I followed BrowserUK's link below and in that thread there is a statement that elegantly describes the problem with a biased shuffle (When the Best Solution Isn't), by blakem: "It maps 8 paths to 6 end states". In this case, it's 3125 (5**5) paths to 120 (5!) end states  assuming 5 elements to be shuffled.  [reply] [d/l] [select] 
Re: FisherYates theory... does this prove that it is invalid? by BrowserUk (Pope) on Jul 25, 2003 at 02:04 UTC 
The problem is, your shuffle routine is not an implementation of a FisherYates shuffle.
This line
my $j = shift @$rand;
is no way equivalent to this line from the FAQ implementation
my $j = int rand ($i+1);
The latter picks a swap partner for the current value of $i, randomly between 0 and $i1. I can't quite wrap my brain around what your code is doing here, but it isn't even vaguely equivalent.
Therefore you are not testing a FisherYates shuffle, but some shuffle algorithm of your own invention, which you succeed in proving isn't as good as the FisherYates.
You might find this post Re: When the Best Solution Isn't that does a statistical analysis of several shuffle routines, a FisherYates amongst them, including frequency and standard deviation interesting.
Examine what is said, not who speaks.
"Efficiency is intelligent laziness." David Dunham
"When I'm working on a problem, I never think about beauty. I think only how to solve the problem. But when I have finished, if the solution is not beautiful, I know it is wrong." Richard Buckminster Fuller
 [reply] [d/l] [select] 
Re: FisherYates theory... does this prove that it is invalid? by adrianh (Chancellor) on Jul 25, 2003 at 01:58 UTC 
Please analyze the code carefully, pull out your school books, and see if I have made a mistake.
Yes you have. You're not emulating a FisherYates shuffle :)
Consider the original code from perlfaq:
sub fisher_yates_shuffle {
my $deck = shift; # $deck is a reference to an array
my $i = @$deck;
while ($i) {
print $i, "\n";
my $j = int rand ($i+1);
@$deck[$i,$j] = @$deck[$j,$i];
}
}
Note how $i is decremented on each iteration. Consider how that alters the sequence of possible indices.
Once you take that into account you get the textbook behaviour.
sub fixed_fisher_yates_shuffle {
my ($deck, $rand) = @_;
my $i = @$deck;
while ($i) {
my $j = shift @$rand;
@$deck[$i,$j] = @$deck[$j,$i];
}
}
use Set::CrossProduct;
my $i = Set::CrossProduct>new([ [0..4], [0..3], [0..2], [0..1], [0] ]
+);
my %count;
while (my $a = $i>get) {
print "@$a : ";
my @foo = (1,2,3,4,5);
fixed_fisher_yates_shuffle(\@foo, $a);
print "@foo\n";
$count{"@foo"}++;
};
foreach my $key (sort keys %count) {
print $key, " = ", $count{$key}, "\n";
};
 [reply] [d/l] [select] 

