First, the smiley in my post was intended to indicate a non-serious post.
- The OPs request makes no sense at all -- other than an interesting academic diversion; or a way to explore Perl -- when there are modules like Math::Random::MT available.
Using the non-determinacy of threading as a source of entropy is an interesting academic exercise.
- Running 4 threads flat out to generate random numbers is ... shall we say 'overkill' :)
When I run it with threaded Perl 5.14.2 on Ubuntu 13.04 (the only threaded Perl I have built at the moment), my distributions are not nearly so nice as yours. I keep seeing things like: { "0" => 3810, "2" => 5960, "16" => 230 }. I have no idea why there's a difference.
I suspect the *nix threading primitives are the cause. Inserting a 10 second delay after the threads are spawned and before $rand is used might allow the threads to get up a running. But that is a guess. (This is not the only area of difference I've seen between threads on windows and *nix; the root of which lies beyond the scope of Perl.)
I also wonder if you need to consider modulo bias. Well, actually looking at the code again, I don't wonder, I'm pretty sure modulo bias can be an issue if $M is set to a number that doesn't divide evenly into 2^32.
Nope. Every PRNG in existance uses modulo operations; the statistics of doing so are thoroughly examined and well documented. Modulo operation bias on fixed-width (32/64-bit) random integer generators output is confined to the highest integer, and very minor. (ie. if mod 24, then 23 has a slightly less chance of being picked; but well within the statistical variation expected.)
I've also tested this fairly extensively (on my system): C:\test>rand -M=7 -N=1e5
{ "0" => 13948, 1 => 14522, 2 => 14752, 3 => 14208, 4 => 14032, 5 => 1
+4408, 6 => 14130 }
C:\test>rand -M=10 -N=1e5
{ "0" => 10417, 1 => 10198, 2 => 10158, 3 => 9546, 4 => 10126, 5 => 97
+64, 6 => 10274, 7 => 9456, 8 => 10155, 9 => 9906 }
C:\test>rand -M=11 -N=1e5
{ "0" => 9109, 1 => 9108, 2 => 9145, 3 => 8820, 4 => 9020, 5 => 9089,
+6 => 9525, 7 => 9232, 8 => 9094, 9 => 9166, 10 => 8692 }
C:\test>rand -M=19 -N=1e5
{ "0" => 5265, 1 => 5638, 2 => 5218, 3 => 5211, 4 => 5226, 5 => 5231,
+6 => 5382, 7 => 5259, 8 => 5129, 9 => 5377, 10 => 5180, 11 => 5780, 1
+2 => 5263, 13 => 5234, 14 => 4934, 15 => 5210, 16 => 5175, 17 => 5416
+, 18 => 4872 }
c:\test>rand -M=97 -N=1e5
{ "0" => 962, 1 => 1092, 2 => 991, 3 => 871, 4 => 1031, 5 => 979, 6 =>
+ 1088, 7 => 931, 8 => 1056, 9 => 854, 10 => 998, 11 => 936, 12 => 101
+7, 13 => 1070, 14 => 1001, 15 => 1012, 16 => 1110, 17 => 934, 18 => 1
+073, 19 => 945, 20 => 887, 21 => 1032, 22 => 979, 23 => 935, 24 => 91
+6, 25 => 986, 26 => 1193, 27 => 1241, 28 => 1108, 29 => 1007, 30 => 1
+061, 31 => 946, 32 => 992, 33 => 947, 34 => 1160, 35 => 1106, 36 => 1
+220, 37 => 937, 38 => 942, 39 => 1067, 40 => 976, 41 => 1222, 42 => 1
+017, 43 => 1147, 44 => 877, 45 => 934, 46 => 865, 47 => 839, 48 => 12
+68, 49 => 1109, 50 => 1135, 51 => 1046, 52 => 1016, 53 => 1079, 54 =>
+ 1055, 55 => 1212, 56 => 1043, 57 => 948, 58 => 1008, 59 => 998, 60 =
+> 900, 61 => 923, 62 => 1095, 63 => 1010, 64 => 1068, 65 => 1252, 66
+=> 1054, 67 => 1046, 68 => 907, 69 => 1076, 70 => 940, 71 => 1075, 72
+ => 1186, 73 => 1075, 74 => 1025, 75 => 1055, 76 => 1098, 77 => 985,
+78 => 1090, 79 => 1151, 80 => 1059, 81 =>
, 87 => 1157, 88 => 921, 89 => 1045, 90 => 912, 91 => 1093, 92 => 1013
+, 93 => 1211, 94 => 1026, 95 => 1015, 96 => 1170 }
C:\test>rand -M=100 -N=1e5
{ "0" => 1029, 1 => 1110, 2 => 977, 3 => 1031, 4 => 1008, 5 => 1137, 6
+ => 998, 7 => 1169, 8 => 915, 9 => 1039, 10 => 879, 11 => 1149, 12 =>
+ 967, 13 => 972, 14 => 954, 15 => 967, 16 => 1219, 17 => 1009, 18 =>
+870, 19 => 858, 20 => 1005, 21 => 1077, 22 => 914, 23 => 930, 24 => 9
+87, 25 => 951, 26 => 979, 27 => 1029, 28 => 1056, 29 => 902, 30 => 96
+5, 31 => 810, 32 => 979, 33 => 1101, 34 => 1009, 35 => 1190, 36 => 86
+8, 37 => 1013, 38 => 981, 39 => 1104, 40 => 905, 41 => 993, 42 => 937
+, 43 => 1018, 44 => 1098, 45 => 1053, 46 => 898, 47 => 908, 48 => 994
+, 49 => 1135, 50 => 1174, 51 => 1010, 52 => 967, 53 => 1025, 54 => 99
+5, 55 => 1029, 56 => 875, 57 => 1129, 58 => 908, 59 => 976, 60 => 906
+, 61 => 922, 62 => 962, 63 => 931, 64 => 1044, 65 => 999, 66 => 980,
+67 => 838, 68 => 973, 69 => 1140, 70 => 949, 71 => 965, 72 => 1044, 7
+3 => 940, 74 => 927, 75 => 1107, 76 => 925, 77 => 1045, 78 => 968, 79
+ => 996, 80 => 890, 81 => 1123, 82 => 119
8 => 990, 89 => 966, 90 => 1072, 91 => 927, 92 => 915, 93 => 1093, 94
+=> 1063, 95 => 966, 96 => 1056, 97 => 994, 98 => 844, 99 => 1086 }
And with FIPS-140-2:
...
my $bytes = join '', map chr( unpack( 'V', $rand ) % 256 ), 1 .. $N;
my $set = unpack '%32b*', $bytes;
printf "In $N bytes, there are %d ones and %d zeros\n", $set, length(
+$bytes ) *8 - $set;
__END__
C:\test>rand -M=97 -N=1e5
In 1e5 bytes, there are 401004 ones and 398996 zeros
C:\test>rand -M=97 -N=1e5
In 1e5 bytes, there are 400686 ones and 399314 zeros
C:\test>rand -N=1e6
In 1e6 bytes, there are 4002738 ones and 3997262 zeros
C:\test>rand -N=1e6
In 1e6 bytes, there are 3997975 ones and 4002025 zeros
In this test, I extend that notion to runs of 1s of various lengths (the 5-sigma figure is equivalent to 21 heads in a row): C:\test>rand -N=1e6
...
my $bytes = join '', map chr( unpack( 'V', $rand ) % 256 ), 1 .. $N;
my $set = unpack '%32b*', $bytes;
printf "In $N bytes, there are %d ones and %d zeros\n", $set, length(
+$bytes ) *8 - $set;
use Statistics::Basic qw[ stddev ];
my @bytes;
++$bytes[ ord( substr $bytes, $_, 1 ) ] for 0 .. length( $bytes ) -1;
#pp \@bytes;
print "Stddev: ", stddev( @bytes );
my %runs;
my $bits = unpack 'b*', $bytes;
for my $n ( 2 .. 40 ) {
++$runs{ length() } for $bits =~ m[(?<=0)(1{$n})(?=0)]g;
}
pp \%runs;
__END__
C:\test>rand -N=1e6
In 1e6 bytes, there are 4000292 ones and 3999708 zeros
Stddev: 191.23
{ 2 => 500474, 3 => 249154, 4 => 124655, 5 => 61747, 6 => 31528, 7 =>
+25885, 8 => 4672, 9 => 656, 10 => 175, 11 => 42, 12 => 17, 13 => 16,
+14 => 18, 15 => 13, 16 => 91, 17 => 38, 18 => 21, 19 => 13, 20 => 10,
+ 21 => 6, 22 => 5, 23 => 9, 24 => 50, 25 => 23, 26 => 17, 27 => 5, 28
+ => 6, 29 => 2, 30 => 2, 31 => 3, 32 => 37, 33 => 21, 34 => 9, 35 =>
+2, 36 => 2, 38 => 4, 39 => 5, 40 => 29 }
C:\test>rand -N=1e6
In 1e6 bytes, there are 3991425 ones and 4008575 zeros
Stddev: 197.76
{ 2 => 500058, 3 => 248757, 4 => 124893, 5 => 62380, 6 => 30268, 7 =>
+24800, 8 => 4693, 9 => 582, 10 => 153, 11 => 59, 12 => 26, 13 => 13,
+14 => 10, 15 => 11, 16 => 108, 17 => 45, 18 => 14, 19 => 11, 20 => 5,
+ 21 => 8, 22 => 7, 23 => 2, 24 => 46, 25 => 13, 26 => 10, 27 => 6, 28
+ => 5, 29 => 3, 30 => 3, 31 => 1, 32 => 50, 33 => 10, 34 => 6, 35 =>
+3, 36 => 5, 37 => 3, 38 => 1, 39 => 1, 40 => 40 }
If you want to plot those out, you'll find that all those numbers correspond uncannily with statistical expectations.
Indeed, if you can find a test -- preferably one that doesn't require a Masters in *nixisms for me to get to run on my system -- I'll run it.
It is, to the best of my ability to test it. truly random. Just horribly expensive for a PRNG! :)
With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
"Science is about questioning the status quo. Questioning authority".
In the absence of evidence, opinion is indistinguishable from prejudice.
| [reply] [d/l] [select] |