Testing for randomness by DrHyde (Prior)
on Oct 23, 2003 at 14:01 UTC
DrHyde has asked for the wisdom of the Perl Monks concerning the following question:
I recently uploaded Net::Random to the CPAN. It gathers data from a couple of online sources of truly random data (which I trust to really *be* random, that's not the issue here), and uses that to generate random numbers in the user's chosen range. For instance, you might want a bunch of random 0s and 1s to simulate tossing a coin, or random numbers from 1 to 6 to simulate a die roll.
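To make the kind of transformation concrete, here is a minimal sketch of mapping raw random bytes onto a chosen range. It is an illustration only, not Net::Random's actual code: `bytes_to_range` is a hypothetical helper, and it assumes the source data arrives as integers 0..255. It discards bytes that fall above the largest multiple of the range size, because a plain modulo would favour some values whenever the range size doesn't divide 256.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of Net::Random's API: map a list of
# random bytes (integers 0..255) onto the range $min..$max.
sub bytes_to_range {
    my ($min, $max, @bytes) = @_;
    my $span = $max - $min + 1;
    die "range must fit in one byte\n" if $span < 1 || $span > 256;

    # Largest multiple of $span that is <= 256. Bytes at or above this
    # limit are discarded, so every residue is equally likely.
    my $limit = 256 - (256 % $span);

    return map { $min + ($_ % $span) } grep { $_ < $limit } @bytes;
}

# Feed in every possible byte value once and count the results.
my @rolls = bytes_to_range(1, 6, 0 .. 255);
my %count;
$count{$_}++ for @rolls;
print "$_: $count{$_}\n" for sort keys %count;
```

With every byte value fed in exactly once, the four bytes 252..255 are dropped and each face from 1 to 6 comes out 42 times.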
Given that I trust the original data to be random, I still need to be sure that what I'm doing to the data isn't biassing it.
Such bias could be introduced in various ways; the two I can think of off the top of my head are:
The question, then, is how to test that my output data is nice and random. I initially thought of using Jon Orwant's Statistics::ChiSquare module, but that has a couple of big drawbacks:
I'm not aware of anything on CPAN that can do that. An alternative, which works because I'm only concerned about whether *I* am introducing bias and not about whether the data itself is biassed, would be to check that the distribution of my results is the same as the distribution of the original data. But I'm not aware of anything that does that either.
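To show what I mean by comparing the two distributions, here is a rough sketch of a two-sample chi-square statistic. `chi_square_two_sample` is a hypothetical name, not an existing CPAN function, and the sketch assumes both the original data and my output have already been reduced to counts over the same set of categories.

```perl
use strict;
use warnings;

# Hypothetical helper: chi-square statistic for the hypothesis that two
# sets of category counts were drawn from the same distribution.
# Takes two hash refs mapping category => observed count.
# Returns the statistic and its degrees of freedom.
sub chi_square_two_sample {
    my ($counts_a, $counts_b) = @_;

    # Union of the categories seen in either sample.
    my %seen = map { $_ => 1 } keys %$counts_a, keys %$counts_b;
    my @cats = sort keys %seen;

    my ($total_a, $total_b) = (0, 0);
    $total_a += $_ for values %$counts_a;
    $total_b += $_ for values %$counts_b;
    die "both samples must be non-empty\n" unless $total_a && $total_b;
    my $total = $total_a + $total_b;

    my $chi2 = 0;
    for my $cat (@cats) {
        my $obs_a = $counts_a->{$cat} || 0;
        my $obs_b = $counts_b->{$cat} || 0;
        my $row   = $obs_a + $obs_b;
        next unless $row;

        # Expected counts if both samples share one distribution.
        my $exp_a = $row * $total_a / $total;
        my $exp_b = $row * $total_b / $total;
        $chi2 += ($obs_a - $exp_a)**2 / $exp_a
               + ($obs_b - $exp_b)**2 / $exp_b;
    }
    return ($chi2, @cats - 1);
}

my ($chi2, $df) = chi_square_two_sample(
    { heads => 30, tails => 10 },
    { heads => 10, tails => 30 },
);
print "chi-square = $chi2 with $df degree(s) of freedom\n";
```

The statistic still has to be compared against a chi-square critical value for the given degrees of freedom (3.841 for one degree of freedom at the 5% level), which this sketch leaves to the caller.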
So, can anyone point me at any appropriate modules? Or at an algorithm that I could turn into a module?
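For reference, the basic calculation I'd be starting from is the ordinary chi-square goodness-of-fit statistic against a uniform distribution. This is only a sketch under my own assumptions: `chi_square_uniform` is a made-up name, the output is assumed to be integers in a known range, and turning the statistic into a probability is not handled.

```perl
use strict;
use warnings;

# Hypothetical helper: chi-square goodness-of-fit statistic for a list
# of integers against a uniform distribution over $min..$max.
# Returns the statistic and its degrees of freedom.
sub chi_square_uniform {
    my ($min, $max, @values) = @_;
    my $k = $max - $min + 1;
    die "need at least two categories and some data\n"
        if $k < 2 || !@values;

    my %count;
    for my $v (@values) {
        die "value $v outside $min..$max\n" if $v < $min || $v > $max;
        $count{$v}++;
    }

    # Under uniformity every value is expected equally often.
    my $expected = @values / $k;

    my $chi2 = 0;
    for my $v ($min .. $max) {
        my $observed = $count{$v} || 0;
        $chi2 += ($observed - $expected)**2 / $expected;
    }
    return ($chi2, $k - 1);
}

# Sixty "die rolls" that are perfectly evenly spread.
my ($chi2, $df) = chi_square_uniform(1, 6, (1 .. 6) x 10);
print "chi-square = $chi2 with $df degrees of freedom\n";
```

For a six-sided die there are five degrees of freedom, and the 5% critical value is 11.070, so a statistic above that would be grounds for suspicion.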