Beefy Boxes and Bandwidth Generously Provided by pair Networks
The stupid question is the question not asked
 
PerlMonks  

Re^2: generate random number without using any built in function in perl?

by BrowserUk (Pope)
on May 08, 2013 at 23:50 UTC ( #1032702=note: print w/ replies, xml ) Need Help??


in reply to Re: generate random number without using any built in function in perl?
in thread generate random number without using any built in function in perl?

Why use pseudo-random when you can have random? :)


With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
"Science is about questioning the status quo. Questioning authority".
In the absence of evidence, opinion is indistinguishable from prejudice.


Comment on Re^2: generate random number without using any built in function in perl?
Re^3: generate random number without using any built in function in perl?
by davido (Archbishop) on May 09, 2013 at 05:17 UTC

    It's an interesting concept. If I understand (and please correct me if I'm wrong), you're using the unpredictability of the relationship between a few asynchronous threads as an entropy source, and then reducing that entropy to, in this case, a range of 0 .. 23, though it could just as easily be full bytes, or something else.

    When I run it with threaded Perl 5.14.2 on Ubuntu 13.04 (the only threaded Perl I have built at the moment), my distributions are not nearly so nice as yours. I keep seeing things like:

    { "0" => 3810, "2" => 5960, "16" => 230 }

    I have no idea why there's a difference. I fiddled with different numbers of threads, but that didn't produce appreciable improvements toward getting a smoother distribution.

    I also wonder if you need to consider modulo bias. Well, actually looking at the code again, I don't wonder, I'm pretty sure modulo bias can be an issue if $M is set to a number that doesn't divide evenly into 2^32.

    Initially I intended to test your code with A Perl adaptation of the FIPS-140-2 test, and A randomness dispersion test, but it doesn't take that sharp of an instrument to spot clusters when there are only three numbers drawn out of a possible 24 given thousands of picks.

    Any thoughts why this works for you and not for me?


    Dave

      a shot in the dark guess, it might be the size of IV, I mostly get one key, and once or twice every 20 runs two :)

      $ perl -V:ivsize ivsize='4'; $ perl fudge { 22 => 10000 } $ perl fudge { 4 => 8268, 11 => 1732 }

      First, the smiley in my post was intended to indicate a non-serious post.

      1. The OPs request makes no sense at all -- other than an interesting academic diversion; or a way to explore Perl -- when there are modules like Math::Random::MT available.

        Using the non-determinacy of threading as a source of entropy is an interesting academic exercise.

      2. Running 4 threads flat out to generate random numbers is ... shall we say 'overkill' :)
      When I run it with threaded Perl 5.14.2 on Ubuntu 13.04 (the only threaded Perl I have built at the moment), my distributions are not nearly so nice as yours. I keep seeing things like: { "0" => 3810, "2" => 5960, "16" => 230 }. I have no idea why there's a difference.

      I suspect the *nix threading primitives are the cause. Inserting a 10 second delay after the threads are spawned and before $rand is used might allow the threads to get up a running. But that is a guess. (This is not the only area of difference I've seen between threads on windows and *nix; the root of which lies beyond the scope of Perl.)

      I also wonder if you need to consider modulo bias. Well, actually looking at the code again, I don't wonder, I'm pretty sure modulo bias can be an issue if $M is set to a number that doesn't divide evenly into 2^32.

      Nope.

      Every PRNG in existance uses modulo operations; the statistics of doing so are thoroughly examined and well documented. Modulo operation bias on fixed-width (32/64-bit) random integer generators output is confined to the highest integer, and very minor. (ie. if mod 24, then 23 has a slightly less chance of being picked; but well within the statistical variation expected.)

      I've also tested this fairly extensively (on my system):

      C:\test>rand -M=7 -N=1e5 { "0" => 13948, 1 => 14522, 2 => 14752, 3 => 14208, 4 => 14032, 5 => 1 +4408, 6 => 14130 } C:\test>rand -M=10 -N=1e5 { "0" => 10417, 1 => 10198, 2 => 10158, 3 => 9546, 4 => 10126, 5 => 97 +64, 6 => 10274, 7 => 9456, 8 => 10155, 9 => 9906 } C:\test>rand -M=11 -N=1e5 { "0" => 9109, 1 => 9108, 2 => 9145, 3 => 8820, 4 => 9020, 5 => 9089, +6 => 9525, 7 => 9232, 8 => 9094, 9 => 9166, 10 => 8692 } C:\test>rand -M=19 -N=1e5 { "0" => 5265, 1 => 5638, 2 => 5218, 3 => 5211, 4 => 5226, 5 => 5231, +6 => 5382, 7 => 5259, 8 => 5129, 9 => 5377, 10 => 5180, 11 => 5780, 1 +2 => 5263, 13 => 5234, 14 => 4934, 15 => 5210, 16 => 5175, 17 => 5416 +, 18 => 4872 } c:\test>rand -M=97 -N=1e5 { "0" => 962, 1 => 1092, 2 => 991, 3 => 871, 4 => 1031, 5 => 979, 6 => + 1088, 7 => 931, 8 => 1056, 9 => 854, 10 => 998, 11 => 936, 12 => 101 +7, 13 => 1070, 14 => 1001, 15 => 1012, 16 => 1110, 17 => 934, 18 => 1 +073, 19 => 945, 20 => 887, 21 => 1032, 22 => 979, 23 => 935, 24 => 91 +6, 25 => 986, 26 => 1193, 27 => 1241, 28 => 1108, 29 => 1007, 30 => 1 +061, 31 => 946, 32 => 992, 33 => 947, 34 => 1160, 35 => 1106, 36 => 1 +220, 37 => 937, 38 => 942, 39 => 1067, 40 => 976, 41 => 1222, 42 => 1 +017, 43 => 1147, 44 => 877, 45 => 934, 46 => 865, 47 => 839, 48 => 12 +68, 49 => 1109, 50 => 1135, 51 => 1046, 52 => 1016, 53 => 1079, 54 => + 1055, 55 => 1212, 56 => 1043, 57 => 948, 58 => 1008, 59 => 998, 60 = +> 900, 61 => 923, 62 => 1095, 63 => 1010, 64 => 1068, 65 => 1252, 66 +=> 1054, 67 => 1046, 68 => 907, 69 => 1076, 70 => 940, 71 => 1075, 72 + => 1186, 73 => 1075, 74 => 1025, 75 => 1055, 76 => 1098, 77 => 985, +78 => 1090, 79 => 1151, 80 => 1059, 81 => , 87 => 1157, 88 => 921, 89 => 1045, 90 => 912, 91 => 1093, 92 => 1013 +, 93 => 1211, 94 => 1026, 95 => 1015, 96 => 1170 } C:\test>rand -M=100 -N=1e5 { "0" => 1029, 1 => 1110, 2 => 977, 3 => 1031, 4 => 1008, 5 => 1137, 6 + => 998, 7 => 1169, 8 => 915, 9 => 1039, 10 => 879, 11 => 1149, 12 => + 967, 13 => 972, 14 => 954, 15 => 967, 16 => 1219, 17 => 1009, 18 => +870, 19 => 858, 20 => 1005, 21 => 1077, 22 => 914, 23 => 930, 24 => 9 +87, 25 => 951, 26 => 979, 27 => 1029, 28 => 1056, 29 => 902, 30 => 96 +5, 31 => 810, 32 => 979, 33 => 1101, 34 => 1009, 35 => 1190, 36 => 86 +8, 37 => 1013, 38 => 981, 39 => 1104, 40 => 905, 41 => 993, 42 => 937 +, 43 => 1018, 44 => 1098, 45 => 1053, 46 => 898, 47 => 908, 48 => 994 +, 49 => 1135, 50 => 1174, 51 => 1010, 52 => 967, 53 => 1025, 54 => 99 +5, 55 => 1029, 56 => 875, 57 => 1129, 58 => 908, 59 => 976, 60 => 906 +, 61 => 922, 62 => 962, 63 => 931, 64 => 1044, 65 => 999, 66 => 980, +67 => 838, 68 => 973, 69 => 1140, 70 => 949, 71 => 965, 72 => 1044, 7 +3 => 940, 74 => 927, 75 => 1107, 76 => 925, 77 => 1045, 78 => 968, 79 + => 996, 80 => 890, 81 => 1123, 82 => 119 8 => 990, 89 => 966, 90 => 1072, 91 => 927, 92 => 915, 93 => 1093, 94 +=> 1063, 95 => 966, 96 => 1056, 97 => 994, 98 => 844, 99 => 1086 }

      And with FIPS-140-2:

      ... my $bytes = join '', map chr( unpack( 'V', $rand ) % 256 ), 1 .. $N; my $set = unpack '%32b*', $bytes; printf "In $N bytes, there are %d ones and %d zeros\n", $set, length( +$bytes ) *8 - $set; __END__ C:\test>rand -M=97 -N=1e5 In 1e5 bytes, there are 401004 ones and 398996 zeros C:\test>rand -M=97 -N=1e5 In 1e5 bytes, there are 400686 ones and 399314 zeros C:\test>rand -N=1e6 In 1e6 bytes, there are 4002738 ones and 3997262 zeros C:\test>rand -N=1e6 In 1e6 bytes, there are 3997975 ones and 4002025 zeros

      In this test, I extend that notion to runs of 1s of various lengths (the 5-sigma figure is equivalent to 21 heads in a row):

      C:\test>rand -N=1e6 ... my $bytes = join '', map chr( unpack( 'V', $rand ) % 256 ), 1 .. $N; my $set = unpack '%32b*', $bytes; printf "In $N bytes, there are %d ones and %d zeros\n", $set, length( +$bytes ) *8 - $set; use Statistics::Basic qw[ stddev ]; my @bytes; ++$bytes[ ord( substr $bytes, $_, 1 ) ] for 0 .. length( $bytes ) -1; #pp \@bytes; print "Stddev: ", stddev( @bytes ); my %runs; my $bits = unpack 'b*', $bytes; for my $n ( 2 .. 40 ) { ++$runs{ length() } for $bits =~ m[(?<=0)(1{$n})(?=0)]g; } pp \%runs; __END__ C:\test>rand -N=1e6 In 1e6 bytes, there are 4000292 ones and 3999708 zeros Stddev: 191.23 { 2 => 500474, 3 => 249154, 4 => 124655, 5 => 61747, 6 => 31528, 7 => +25885, 8 => 4672, 9 => 656, 10 => 175, 11 => 42, 12 => 17, 13 => 16, +14 => 18, 15 => 13, 16 => 91, 17 => 38, 18 => 21, 19 => 13, 20 => 10, + 21 => 6, 22 => 5, 23 => 9, 24 => 50, 25 => 23, 26 => 17, 27 => 5, 28 + => 6, 29 => 2, 30 => 2, 31 => 3, 32 => 37, 33 => 21, 34 => 9, 35 => +2, 36 => 2, 38 => 4, 39 => 5, 40 => 29 } C:\test>rand -N=1e6 In 1e6 bytes, there are 3991425 ones and 4008575 zeros Stddev: 197.76 { 2 => 500058, 3 => 248757, 4 => 124893, 5 => 62380, 6 => 30268, 7 => +24800, 8 => 4693, 9 => 582, 10 => 153, 11 => 59, 12 => 26, 13 => 13, +14 => 10, 15 => 11, 16 => 108, 17 => 45, 18 => 14, 19 => 11, 20 => 5, + 21 => 8, 22 => 7, 23 => 2, 24 => 46, 25 => 13, 26 => 10, 27 => 6, 28 + => 5, 29 => 3, 30 => 3, 31 => 1, 32 => 50, 33 => 10, 34 => 6, 35 => +3, 36 => 5, 37 => 3, 38 => 1, 39 => 1, 40 => 40 }

      If you want to plot those out, you'll find that all those numbers correspond uncannily with statistical expectations.

      Indeed, if you can find a test -- preferably one that doesn't require a Masters in *nixisms for me to get to run on my system -- I'll run it.

      It is, to the best of my ability to test it. truly random. Just horribly expensive for a PRNG! :)


      With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
      Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
      "Science is about questioning the status quo. Questioning authority".
      In the absence of evidence, opinion is indistinguishable from prejudice.

Log In?
Username:
Password:

What's my password?
Create A New User
Node Status?
node history
Node Type: note [id://1032702]
help
Chatterbox?
and the web crawler heard nothing...

How do I use this? | Other CB clients
Other Users?
Others cooling their heels in the Monastery: (6)
As of 2014-08-30 15:24 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    The best computer themed movie is:











    Results (293 votes), past polls