I will report back chi-squared results using R
It will be interesting to see the results from a known good source.
Because I think that S::CS (Statistics::ChiSquare) is (fatally) flawed. To get some feel for the accuracy of the test it performs, I decided to run it on the shuffle, using the known-good MT PRNG and a small dataset (1..4), a good number of times to see how consistent the results from S::CS were. The answer is not just "not very", but actually just "not":
#! perl -slw
use strict;
use Statistics::ChiSquare qw[ chisquare ];
use Math::Random::MT;
use Data::Dump qw[ pp ];
my $mt = Math::Random::MT->new();
our $N //= 1e6;
our $ASIZE //= 4;
our $T //= 4;
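sub shuffle {
    # Fisher-Yates: swap each element with a randomly chosen one at or
    # after it. Works in place on @_, which aliases the caller's array.
    $a = $_ + $mt->rand( @_ - $_ ),
    $b = $_[$_],
    $_[$_] = $_[$a],
    $_[$a] = $b
        for 0 .. $#_;
    return @_;
}
my @data = ( 1 .. $ASIZE );
my @chi;
for( 1 .. $T ) {
    # Count how often each permutation turns up in $N shuffles,
    # then hand the counts to S::CS.
    my %tests;
    ++$tests{ join '', shuffle( @data ) } for 1 .. $N;
    print chisquare( values %tests );
}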
__END__
C:\test>chiSquareChiSquare -ASIZE=4 -N=1e4 -T=100
There's a >25% chance, and a <50% chance, that this data is random.
There's a >50% chance, and a <75% chance, that this data is random.
There's a >10% chance, and a <25% chance, that this data is random.
There's a >10% chance, and a <25% chance, that this data is random.
There's a >50% chance, and a <75% chance, that this data is random.
There's a >10% chance, and a <25% chance, that this data is random.
There's a >25% chance, and a <50% chance, that this data is random.
There's a >75% chance, and a <90% chance, that this data is random.
There's a >25% chance, and a <50% chance, that this data is random.
There's a >50% chance, and a <75% chance, that this data is random.
There's a >5% chance, and a <10% chance, that this data is random.
There's a >75% chance, and a <90% chance, that this data is random.
There's a >10% chance, and a <25% chance, that this data is random.
There's a >50% chance, and a <75% chance, that this data is random.
There's a >75% chance, and a <90% chance, that this data is random.
There's a >75% chance, and a <90% chance, that this data is random.
There's a >50% chance, and a <75% chance, that this data is random.
There's a >50% chance, and a <75% chance, that this data is random.
There's a >5% chance, and a <10% chance, that this data is random.
There's a >95% chance, and a <99% chance, that this data is random.
There's a >1% chance, and a <5% chance, that this data is random.
... followed by 79 more utterly inconsistent results.
Given that this is a known-good algorithm using a known-good PRNG then, as I said earlier, I think those results are as good a definition of random as I've seen a module produce.
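Since S::CS only prints a probability bracket, a cross-check against R is easier with the raw statistic in hand. What follows is a minimal sketch, not the module's method: `chi2_uniform` is a hypothetical helper name, the shuffle is rewritten to work on a copy, and Perl's built-in `rand` stands in for Math::Random::MT purely to keep the example dependency-free.

```perl
#! perl
use strict;
use warnings;

# Fisher-Yates shuffle on a copy of the arguments.
# Assumption: built-in rand() replaces $mt->rand() from the script above.
sub shuffle {
    my @a = @_;
    for my $i ( 0 .. $#a ) {
        my $j = $i + int rand( @a - $i );    # random index in $i .. $#a
        @a[ $i, $j ] = @a[ $j, $i ];
    }
    return @a;
}

# Pearson's statistic against a uniform expectation:
# the sum over bins of (observed - expected)^2 / expected.
# Returns the statistic and the degrees of freedom (bins - 1).
sub chi2_uniform {
    my @obs   = @_;
    my $total = 0;
    $total += $_ for @obs;
    my $expected = $total / @obs;
    my $chi2     = 0;
    $chi2 += ( $_ - $expected )**2 / $expected for @obs;
    return ( $chi2, @obs - 1 );
}

my $N    = 1e4;
my @data = ( 1 .. 4 );

my %tests;
++$tests{ join '', shuffle( @data ) } for 1 .. $N;

# Caveat: a permutation that never occurred has no key in %tests, so the
# bin count would be short; with 24 permutations and 1e4 trials that is
# very unlikely.
my( $chi2, $df ) = chi2_uniform( values %tests );
printf "chi2 = %.3f, df = %d\n", $chi2, $df;
```

Passing the same counts to R's `chisq.test` should give the same statistic, which makes it possible to check which bracket S::CS ought to have reported for a given run.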
With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
In the absence of evidence, opinion is indistinguishable from prejudice.