Beefy Boxes and Bandwidth Generously Provided by pair Networks
Your skill will accomplish
what the force of many cannot
 
PerlMonks  

Monte Carlo - Coin Toss

by James_H (Acolyte)
on Mar 09, 2011 at 22:20 UTC ( [id://892293]=CUFP: print w/replies, xml ) Need Help??

I hope this is appropriate for this section

Recently, I was asked to find the probability density function of distribution (pdf) of tossing 20 coins at a time. The time it took me to write a Monte Carlo simulation was half the time it took me to compute it from the binomial dist or by building a Pascal's triangle. Maybe thats not a good reflection on my math, but as you will see, I'm not a fancy Perl programmer either :)

Anyway, here is the code which produces very accurate results in a matter of 30sec on my modest laptop.

I added mind numbing comments for utter beginners. Enjoy !

Regards, James

# coinToss.pl # # Toss a collection (20 in this example) of coins a large number of ti +mes # to determine the distribution of Heads and Tails. # N.B. P(Head) = P(Tail) = 50% = 0.5 use strict; # Inputs my $numTosses = 20; # The num of coin tosses per experiment (20 + in this example) my $runs = 10000000; # 10 Million - the num of times we repeat the +experiment # Program vars my $i; # a looping variable my $j; # another looping var my $toss; # Keeps running total of the number of 'Heads' during + current experiment my @collect; # An array that keeps a total of the number of 'Heads' + counted # in all previous experiment. my $percent; # To convert $collect[0] - $collect[19] to % # Outer loop: Repeat "$runs" times for ($j = 0; $j < $runs; $j++) { # Inner loop: One run of 20 tosses for ($i = 0; $i < $numTosses; $i++) { $toss += (rand() < 0.5) ? 1 : 0; #print "$toss\n"; } $collect[$toss]++; $toss = 0; } # Print results print "\nTails\tCount out of $runs\t%\n"; for ($i = 0; $i < $numTosses+1; $i++) { $percent = sprintf "%.2f", $collect[$i] / $runs * 100; print "$i\t$collect[$i]\t\t\t$percent%\n" }

Replies are listed 'Best First'.
Re: Monte Carlo - Coin Toss
by johngg (Canon) on Mar 11, 2011 at 15:05 UTC

    That is an interesting problem and ++ for your effective solution.

    I hope you don't mind me modifying your script to make it a little more Perlish. It is more idiomatic (and often simpler) to use Perl-style rather than C-style loops. It is also, I think, preferable to declare variables inside the scope where they will be used rather than at the beginning of the script which, in effect, makes them global. I have renamed $toss to $tailsCt as I think that better describes its purpose. I have employed constants for number of runs and tosses just to illustrate their use. I also have added plenty of comments to clarify what I have done.

    # Employ strictures and warnings. # use strict; use warnings; # Declare things we don't change as constants (by convention # use upper-case for these). # use constant { NUM_TOSSES => 20, RUNS => 10_000_000, }; # Initialise all possible elements of @collect to avoid # "uninitialised" warnings at the printing stage if, for # example, no "twenties" were thrown. # my @collect = map 0, 0 .. NUM_TOSSES; # Use Perl-style for or foreach loops rather than C-style # ones. The for and foreach are synonymous in Perl and can # be freely interchanged. The loop variable is, by default, # available in $_ or you can assign it to a lexically # scoped variable as I do further down when printing the # results. We don't need it in these nested loops. # for ( 1 .. RUNS ) { # Declare lexically scoped $tailsCt so a new variable is # brought into existence each time round the outer loop # and will go out of scope on loop exit, thus no need to # zeroise every time. # my $tailsCt; # Toss the coin NUM_TOSSES times. # for ( 1 .. NUM_TOSSES ) { # Give rand() an argument n and it will return a value # such that 0 <= value < n. Use int to truncate so # that $tailsCt is incremented by 0 or 1 (heads or # tails respectively). # $tailsCt += int rand 2; } $collect[ $tailsCt ] ++; } # Rather than trying to line things up with tabs it might be # easier to use printf and employ the same field widths in the # heading and data rows. The RUNS constant is, in effect, a # piece of code rather than a variable so we have to enclose # it in a ${ \ ... } or a @{ [ ... ] } construct to interpolate # it into a double-quoted string. # printf qq{%5s%25s%9s\n}, q{Tails}, qq{Count out of ${ \ RUNS }}, q{%ge}; # q{Tails}, qq{Count out of @{ [ RUNS ] }}, q{%ge}; also works # Loop over the possible results from all heads to all tails # using lexically scoped $tailsCt to access elements in the # @collect array. This $tailsCt is a different variable to the # $tailsCt in the for ( 1 .. RUNS ) { ... } loop above and is # only in existence within the scope of this loop. # foreach my $tailsCt ( 0 .. NUM_TOSSES ) { # Rather than using a temporary variable and sprintf you # can just use printf directly and do the calculation in # the argument list. # printf qq{%5d%25d%9.2f\n}, $tailsCt, $collect[ $tailsCt ], $collect[ $tailsCt ] / RUNS * 100; }

    I hope these observations are useful.

    Cheers,

    JohnGG

      It is more idiomatic (and often simpler) to use Perl-style rather than C-style loops.

      Yes, but it won't work if the numbers are too large for integers and have to use floating point.    Then only C style loops will work.

      my @collect = map 0, 0 .. NUM_TOSSES;

      Why are you initializing @collect with NUM_TOSSES + 1 elements?

      The idiomatic way to initialize @collect with NUM_TOSSES elements is usually:

      my @collect = ( 0 ) x NUM_TOSSES;
      for ($i = 0; $i < $numTosses+1; $i++) { foreach my $tailsCt ( 0 .. NUM_TOSSES )

      You have an off-by-one error, it should be:

      foreach my $tailsCt ( 0 .. NUM_TOSSES - 1 )

      Update: oops, I misread the original code.

        Why are you initializing @collect with NUM_TOSSES + 1 elements?

        The idiomatic way to initialize @collect with NUM_TOSSES elements is usually:

        my @collect = ( 0 ) x NUM_TOSSES;

        As Eliya points out, 20 tosses could give 21 outcomes as regards the number of "tails" tossed, i.e. 0, 1, ... 19 or 20 so your way should be

        my @collect = ( 0 ) x ( NUM_TOSSES + 1 );

        and I would agree that your way is perhaps more idiomatic. However, I felt that

        my @collect = map 0, 0 .. NUM_TOSSES;

        better illustrated the relation between the numbers of tosses and outcomes. What do others think?

        Cheers,

        JohnGG

        You have an off-by-one error, it should be: ...

        No, 0 .. NUM_TOSSES is correct here, because adding up NUM_TOSSES (with a single result being 0 or 1) can potentially produce values ranging from zero to NUM_TOSSES.

        Tossing a coin once can produce 0 or 1 tails;
        tossing a coin twice can produce 0, 1 or 2 tails;
        etc.

        Yes, but it won't work if the numbers are too large for integers and have to use floating point.

        If 2 billion loop iterations are not enough for you, you can always nest loops to multiple the range.

      Many thanks JohnGG!

Re: Monte Carlo - Coin Toss
by roboticus (Chancellor) on Mar 12, 2011 at 13:53 UTC

    James_H:

    Obviously your "modest laptop" is a bit zippier than my Atom-based desktop, as it took 3:10 to complete. But when I looked at your code, I was wondering why you did a monte-carlo simulation. Since there are only 20 coins, there are a little over a million combinations, so why not compute it exactly?

    So I whipped this up:

    $ cat 892293.pl use strict; use warnings; my $numTosses = 20; # Coin tosses per experiment (20 in this example) my $runs = (1<<$numTosses)-1; my @tailCnt; for (my $collection=0; $collection<1<<$numTosses; ++$collection) { my $tails=0; $tails += ($collection & 1<<$_) ? 1 : 0 for 0 .. $numTosses; $tailCnt[$tails]++; } print <<EOHDR; Tails Count % ----- -------- ------ EOHDR for (my $i = 0; $i < $numTosses+1; $i++) { printf "% 4u % 8u %5.2f\n", $i, $tailCnt[$i], 100*$tailCnt[$i]/$runs; } $ time perl 892293.pl 0x00100000, 1048576 Tails Count % ----- -------- ------ 0 1 0.00 1 20 0.00 2 190 0.02 3 1140 0.11 4 4845 0.46 5 15504 1.48 6 38760 3.70 7 77520 7.39 8 125970 12.01 9 167960 16.02 10 184756 17.62 11 167960 16.02 12 125970 12.01 13 77520 7.39 14 38760 3.70 15 15504 1.48 16 4845 0.46 17 1140 0.11 18 190 0.02 19 20 0.00 20 1 0.00 real 0m20.372s user 0m20.273s sys 0m0.016s

    It runs quite a bit faster on my machine, and it tried each combination once. Of course, I'm limited by the number of bits available, so large experiments aren't going to work well. You can use Bit::Vector, but I suspect that would be a good deal slower. (I wouldn't know, I didn't try it.)

    ...roboticus

    When your only tool is a hammer, all problems look like your thumb.

      ...find the probability density function of distribution (pdf) of tossing 20 coins at a time.

      Let me add a pedantic point. When asked to find a PDF (e.g., in a statistical theory class), the requester is likely to mean that the function should be found analytically. That said, it is equally valid to compute it exactly by enumeration (as roboticus did).

      Someone looking for a Monte Carlo approach will typically ask you to estimate the PDF.

      You can use Bit::Vector, but I suspect that would be a good deal slower.

      There wouldn't be any need to do that. With 32-bit ints, a full run would take around 7 hours. With 64-bit ints, it would take around 3 million years.

      If you switched to using unpack to count the bits:

      $tailCnt[ unpack '%32b*', pack 'N', $collection ]++;

      Then the full 32-bit comes down to a very reasonable 40 minutes, but that still leaves the full 64-bit requiring several hundred thousand years.


      Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
      "Science is about questioning the status quo. Questioning authority".
      In the absence of evidence, opinion is indistinguishable from prejudice.

        BrowserUk:

        Heh, yep, that would be a good deal slower, all right! Of course, if speed gets to be an issue, you can always skip counting all the iterations, and directly compute the binomial expansion:

        use strict; use warnings; use List::Util qw(reduce); my $numTosses = 20; #my $runs = (1<<$numTosses)-1; my @triangle = (0, 1, 0); for (1 .. $numTosses) { my @newTriangle=(0); push @newTriangle, $triangle[$_]+$triangle[$_+1] for 0 .. $#triang +le-1; push @newTriangle, 0; @triangle = @newTriangle; } print <<EOHDR; Tails Count % ----- ---------- ------ EOHDR my $runs = reduce { $a + $b } @triangle; for (my $i = 0; $i < $numTosses+1; $i++) { printf "% 4u % 10u %5.2f\n", $i, $triangle[$i+1], 100*$triangle[$i+1]/$runs; }

        It runs a good bit faster:

        $ time perl 892293.pl Name "main::a" used only once: possible typo at 892293.pl line 21. Name "main::b" used only once: possible typo at 892293.pl line 21. Tails Count % ----- ---------- ------ 0 1 0.00 1 20 0.00 2 190 0.02 3 1140 0.11 4 4845 0.46 5 15504 1.48 6 38760 3.70 7 77520 7.39 8 125970 12.01 9 167960 16.02 10 184756 17.62 11 167960 16.02 12 125970 12.01 13 77520 7.39 14 38760 3.70 15 15504 1.48 16 4845 0.46 17 1140 0.11 18 190 0.02 19 20 0.00 20 1 0.00 real 0m0.034s user 0m0.024s sys 0m0.012s

        Even if you use 32 bits:

        $ time perl 892293.pl Name "main::a" used only once: possible typo at 892293.pl line 21. Name "main::b" used only once: possible typo at 892293.pl line 21. Tails Count % ----- ---------- ------ 0 1 0.00 1 32 0.00 2 496 0.00 3 4960 0.00 4 35960 0.00 5 201376 0.00 6 906192 0.02 7 3365856 0.08 8 10518300 0.24 9 28048800 0.65 10 64512240 1.50 11 129024480 3.00 12 225792840 5.26 13 347373600 8.09 14 471435600 10.98 15 565722720 13.17 16 601080390 13.99 17 565722720 13.17 18 471435600 10.98 19 347373600 8.09 20 225792840 5.26 21 129024480 3.00 22 64512240 1.50 23 28048800 0.65 24 10518300 0.24 25 3365856 0.08 26 906192 0.02 27 201376 0.00 28 35960 0.00 29 4960 0.00 30 496 0.00 31 32 0.00 32 1 0.00 real 0m0.034s user 0m0.028s sys 0m0.004s

        ...roboticus

        When your only tool is a hammer, all problems look like your thumb.

        Update: When I use bignum;, it prints the PDF for 64 tosses per run in just over a second, but I haven't got it formatted nicely...

        use strict; use warnings; use List::Util qw(reduce); use bignum; my $numTosses = 64; #my $runs = (1<<$numTosses)-1; my @triangle = (0, 1, 0); for (1 .. $numTosses) { my @newTriangle=(0); push @newTriangle, $triangle[$_]+$triangle[$_+1] for 0 .. $#triang +le-1; push @newTriangle, 0; @triangle = @newTriangle; } print <<EOHDR; Tails Count % ----- ---------- ------ EOHDR my $runs = reduce { $a + $b } @triangle; for (my $i = 0; $i < $numTosses+1; $i++) { print "$i\t$triangle[$i+1]\t",100*$triangle[$i+1]/$runs, "\n"; }

        I guess I'll have to figure out how to make bugnum and printf cooperate better...

Log In?
Username:
Password:

What's my password?
Create A New User
Domain Nodelet?
Node Status?
node history
Node Type: CUFP [id://892293]
Approved by ww
Front-paged by Arunbear
help
Chatterbox?
and the web crawler heard nothing...

How do I use this?Last hourOther CB clients
Other Users?
Others about the Monastery: (4)
As of 2024-03-19 06:45 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    No recent polls found