Syntactic Confectionery Delight PerlMonks

### Monte Carlo - Coin Toss

by James_H (Acolyte)
 on Mar 09, 2011 at 22:20 UTC ( #892293=CUFP: print w/replies, xml ) Need Help??

I hope this is appropriate for this section

Recently, I was asked to find the probability density function of distribution (pdf) of tossing 20 coins at a time. The time it took me to write a Monte Carlo simulation was half the time it took me to compute it from the binomial dist or by building a Pascal's triangle. Maybe thats not a good reflection on my math, but as you will see, I'm not a fancy Perl programmer either :)

Anyway, here is the code which produces very accurate results in a matter of 30sec on my modest laptop.

Regards, James

```# coinToss.pl
#
# Toss a collection (20 in this example) of coins a large number of ti
+mes
# to determine the distribution of Heads and Tails.
# N.B. P(Head) = P(Tail) = 50% = 0.5

use strict;

# Inputs
my \$numTosses = 20;        # The num of coin tosses per experiment (20
+ in this example)
my \$runs = 10000000;    # 10 Million - the num of times we repeat the
+experiment

# Program vars
my \$i;            # a looping variable
my \$j;            # another looping var
my \$toss;        # Keeps running total of the number of 'Heads' during
+ current experiment
my @collect;    # An array that keeps a total of the number of 'Heads'
+ counted
# in all previous experiment.
my \$percent;    # To convert \$collect[0] - \$collect[19] to %

# Outer loop: Repeat "\$runs" times
for (\$j = 0; \$j < \$runs; \$j++) {
# Inner loop: One run of 20 tosses
for (\$i = 0; \$i < \$numTosses; \$i++) {
\$toss += (rand() < 0.5) ? 1 : 0;
#print "\$toss\n";
}
\$collect[\$toss]++;
\$toss = 0;
}

# Print results
print "\nTails\tCount out of \$runs\t%\n";
for (\$i = 0; \$i < \$numTosses+1; \$i++) {
\$percent = sprintf "%.2f", \$collect[\$i] / \$runs * 100;
print "\$i\t\$collect[\$i]\t\t\t\$percent%\n"
}

Replies are listed 'Best First'.
Re: Monte Carlo - Coin Toss
by johngg (Abbot) on Mar 11, 2011 at 15:05 UTC

That is an interesting problem and ++ for your effective solution.

I hope you don't mind me modifying your script to make it a little more Perlish. It is more idiomatic (and often simpler) to use Perl-style rather than C-style loops. It is also, I think, preferable to declare variables inside the scope where they will be used rather than at the beginning of the script which, in effect, makes them global. I have renamed \$toss to \$tailsCt as I think that better describes its purpose. I have employed constants for number of runs and tosses just to illustrate their use. I also have added plenty of comments to clarify what I have done.

```# Employ strictures and warnings.
#
use strict;
use warnings;

# Declare things we don't change as constants (by convention
# use upper-case for these).
#
use constant {
NUM_TOSSES => 20,
RUNS       => 10_000_000,
};

# Initialise all possible elements of @collect to avoid
# "uninitialised" warnings at the printing stage if, for
# example, no "twenties" were thrown.
#
my @collect = map 0, 0 .. NUM_TOSSES;

# Use Perl-style for or foreach loops rather than C-style
# ones. The for and foreach are synonymous in Perl and can
# be freely interchanged. The loop variable is, by default,
# available in \$_ or you can assign it to a lexically
# scoped variable as I do further down when printing the
# results. We don't need it in these nested loops.
#
for ( 1 .. RUNS )
{
# Declare lexically scoped \$tailsCt so a new variable is
# brought into existence each time round the outer loop
# and will go out of scope on loop exit, thus no need to
# zeroise every time.
#
my \$tailsCt;

# Toss the coin NUM_TOSSES times.
#
for ( 1 .. NUM_TOSSES )
{
# Give rand() an argument n and it will return a value
# such that 0 <= value < n. Use int to truncate so
# that \$tailsCt is incremented by 0 or 1 (heads or
# tails respectively).
#
\$tailsCt += int rand 2;
}
\$collect[ \$tailsCt ] ++;
}

# Rather than trying to line things up with tabs it might be
# easier to use printf and employ the same field widths in the
# heading and data rows. The RUNS constant is, in effect, a
# piece of code rather than a variable so we have to enclose
# it in a \${ \ ... } or a @{ [ ... ] } construct to interpolate
# it into a double-quoted string.
#
printf qq{%5s%25s%9s\n},
q{Tails}, qq{Count out of \${ \ RUNS }}, q{%ge};
#  q{Tails}, qq{Count out of @{ [ RUNS ] }}, q{%ge}; also works

# Loop over the possible results from all heads to all tails
# using lexically scoped \$tailsCt to access elements in the
# @collect array. This \$tailsCt is a different variable to the
# \$tailsCt in the for ( 1 .. RUNS ) { ... } loop above and is
# only in existence within the scope of this loop.
#
foreach my \$tailsCt ( 0 .. NUM_TOSSES )
{
# Rather than using a temporary variable and sprintf you
# can just use printf directly and do the calculation in
# the argument list.
#
printf qq{%5d%25d%9.2f\n},
\$tailsCt,
\$collect[ \$tailsCt ],
\$collect[ \$tailsCt ] / RUNS * 100;
}

I hope these observations are useful.

Cheers,

JohnGG

It is more idiomatic (and often simpler) to use Perl-style rather than C-style loops.

Yes, but it won't work if the numbers are too large for integers and have to use floating point.    Then only C style loops will work.

```my @collect = map 0, 0 .. NUM_TOSSES;

Why are you initializing @collect with NUM_TOSSES + 1 elements?

The idiomatic way to initialize @collect with NUM_TOSSES elements is usually:

```my @collect = ( 0 ) x NUM_TOSSES;
```for (\$i = 0; \$i < \$numTosses+1; \$i++) {

foreach my \$tailsCt ( 0 .. NUM_TOSSES )

You have an off-by-one error, it should be:

```foreach my \$tailsCt ( 0 .. NUM_TOSSES - 1 )

Update: oops, I misread the original code.

Why are you initializing @collect with NUM_TOSSES + 1 elements?

The idiomatic way to initialize @collect with NUM_TOSSES elements is usually:

```my @collect = ( 0 ) x NUM_TOSSES;

As Eliya points out, 20 tosses could give 21 outcomes as regards the number of "tails" tossed, i.e. 0, 1, ... 19 or 20 so your way should be

```my @collect = ( 0 ) x ( NUM_TOSSES + 1 );

and I would agree that your way is perhaps more idiomatic. However, I felt that

```my @collect = map 0, 0 .. NUM_TOSSES;

better illustrated the relation between the numbers of tosses and outcomes. What do others think?

Cheers,

JohnGG

You have an off-by-one error, it should be: ...

No, 0 .. NUM_TOSSES is correct here, because adding up NUM_TOSSES (with a single result being 0 or 1) can potentially produce values ranging from zero to NUM_TOSSES.

Tossing a coin once can produce 0 or 1 tails;
tossing a coin twice can produce 0, 1 or 2 tails;
etc.

Yes, but it won't work if the numbers are too large for integers and have to use floating point.

If 2 billion loop iterations are not enough for you, you can always nest loops to multiple the range.

Many thanks JohnGG!

Re: Monte Carlo - Coin Toss
by roboticus (Chancellor) on Mar 12, 2011 at 13:53 UTC

Obviously your "modest laptop" is a bit zippier than my Atom-based desktop, as it took 3:10 to complete. But when I looked at your code, I was wondering why you did a monte-carlo simulation. Since there are only 20 coins, there are a little over a million combinations, so why not compute it exactly?

So I whipped this up:

```\$ cat 892293.pl
use strict;
use warnings;

my \$numTosses = 20;  # Coin tosses per experiment (20 in this example)
my \$runs = (1<<\$numTosses)-1;

my @tailCnt;
for (my \$collection=0; \$collection<1<<\$numTosses; ++\$collection) {
my \$tails=0;
\$tails += (\$collection & 1<<\$_) ? 1 : 0 for 0 .. \$numTosses;
\$tailCnt[\$tails]++;
}

print <<EOHDR;
Tails     Count     %
-----   --------  ------
EOHDR
for (my \$i = 0; \$i < \$numTosses+1; \$i++) {
printf "% 4u    % 8u   %5.2f\n",
\$i,  \$tailCnt[\$i],  100*\$tailCnt[\$i]/\$runs;
}

\$ time perl 892293.pl
0x00100000, 1048576
Tails     Count     %
-----   --------  ------
0           1    0.00
1          20    0.00
2         190    0.02
3        1140    0.11
4        4845    0.46
5       15504    1.48
6       38760    3.70
7       77520    7.39
8      125970   12.01
9      167960   16.02
10      184756   17.62
11      167960   16.02
12      125970   12.01
13       77520    7.39
14       38760    3.70
15       15504    1.48
16        4845    0.46
17        1140    0.11
18         190    0.02
19          20    0.00
20           1    0.00

real    0m20.372s
user    0m20.273s
sys     0m0.016s

It runs quite a bit faster on my machine, and it tried each combination once. Of course, I'm limited by the number of bits available, so large experiments aren't going to work well. You can use Bit::Vector, but I suspect that would be a good deal slower. (I wouldn't know, I didn't try it.)

...roboticus

When your only tool is a hammer, all problems look like your thumb.

You can use Bit::Vector, but I suspect that would be a good deal slower.

There wouldn't be any need to do that. With 32-bit ints, a full run would take around 7 hours. With 64-bit ints, it would take around 3 million years.

If you switched to using unpack to count the bits:

```    \$tailCnt[ unpack '%32b*', pack 'N', \$collection ]++;

Then the full 32-bit comes down to a very reasonable 40 minutes, but that still leaves the full 64-bit requiring several hundred thousand years.

Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
"Science is about questioning the status quo. Questioning authority".
In the absence of evidence, opinion is indistinguishable from prejudice.

Heh, yep, that would be a good deal slower, all right! Of course, if speed gets to be an issue, you can always skip counting all the iterations, and directly compute the binomial expansion:

```use strict;
use warnings;
use List::Util qw(reduce);

my \$numTosses = 20;
#my \$runs = (1<<\$numTosses)-1;

my @triangle = (0, 1, 0);
for (1 .. \$numTosses) {
my @newTriangle=(0);
push @newTriangle, \$triangle[\$_]+\$triangle[\$_+1] for 0 .. \$#triang
+le-1;
push @newTriangle, 0;
@triangle = @newTriangle;
}

print <<EOHDR;
Tails      Count     %
-----   ----------  ------
EOHDR
my \$runs = reduce { \$a + \$b } @triangle;
for (my \$i = 0; \$i < \$numTosses+1; \$i++) {
printf "% 4u    % 10u   %5.2f\n",
\$i, \$triangle[\$i+1], 100*\$triangle[\$i+1]/\$runs;
}

It runs a good bit faster:

```\$ time perl 892293.pl
Name "main::a" used only once: possible typo at 892293.pl line 21.
Name "main::b" used only once: possible typo at 892293.pl line 21.
Tails      Count     %
-----   ----------  ------
0             1    0.00
1            20    0.00
2           190    0.02
3          1140    0.11
4          4845    0.46
5         15504    1.48
6         38760    3.70
7         77520    7.39
8        125970   12.01
9        167960   16.02
10        184756   17.62
11        167960   16.02
12        125970   12.01
13         77520    7.39
14         38760    3.70
15         15504    1.48
16          4845    0.46
17          1140    0.11
18           190    0.02
19            20    0.00
20             1    0.00

real    0m0.034s
user    0m0.024s
sys     0m0.012s

Even if you use 32 bits:

```\$ time perl 892293.pl
Name "main::a" used only once: possible typo at 892293.pl line 21.
Name "main::b" used only once: possible typo at 892293.pl line 21.
Tails      Count     %
-----   ----------  ------
0             1    0.00
1            32    0.00
2           496    0.00
3          4960    0.00
4         35960    0.00
5        201376    0.00
6        906192    0.02
7       3365856    0.08
8      10518300    0.24
9      28048800    0.65
10      64512240    1.50
11     129024480    3.00
12     225792840    5.26
13     347373600    8.09
14     471435600   10.98
15     565722720   13.17
16     601080390   13.99
17     565722720   13.17
18     471435600   10.98
19     347373600    8.09
20     225792840    5.26
21     129024480    3.00
22      64512240    1.50
23      28048800    0.65
24      10518300    0.24
25       3365856    0.08
26        906192    0.02
27        201376    0.00
28         35960    0.00
29          4960    0.00
30           496    0.00
31            32    0.00
32             1    0.00

real    0m0.034s
user    0m0.028s
sys     0m0.004s

...roboticus

When your only tool is a hammer, all problems look like your thumb.

Update: When I use bignum;, it prints the PDF for 64 tosses per run in just over a second, but I haven't got it formatted nicely...

```use strict;
use warnings;
use List::Util qw(reduce);
use bignum;

my \$numTosses = 64;
#my \$runs = (1<<\$numTosses)-1;

my @triangle = (0, 1, 0);
for (1 .. \$numTosses) {
my @newTriangle=(0);
push @newTriangle, \$triangle[\$_]+\$triangle[\$_+1] for 0 .. \$#triang
+le-1;
push @newTriangle, 0;
@triangle = @newTriangle;
}

print <<EOHDR;
Tails      Count     %
-----   ----------  ------
EOHDR
my \$runs = reduce { \$a + \$b } @triangle;
for (my \$i = 0; \$i < \$numTosses+1; \$i++) {
print "\$i\t\$triangle[\$i+1]\t",100*\$triangle[\$i+1]/\$runs, "\n";
}

I guess I'll have to figure out how to make bugnum and printf cooperate better...

...find the probability density function of distribution (pdf) of tossing 20 coins at a time.

Let me add a pedantic point. When asked to find a PDF (e.g., in a statistical theory class), the requester is likely to mean that the function should be found analytically. That said, it is equally valid to compute it exactly by enumeration (as roboticus did).

Someone looking for a Monte Carlo approach will typically ask you to estimate the PDF.

Create A New User
Node Status?
node history
Node Type: CUFP [id://892293]
Approved by ww
Front-paged by Arunbear
help
Chatterbox?
and dust plays in a shaft of sunlight...

How do I use this? | Other CB clients
Other Users?
Others cooling their heels in the Monastery: (5)
As of 2017-08-18 21:27 GMT
Sections?
Information?
Find Nodes?
Leftovers?
Voting Booth?
Who is your favorite scientist and why?

Results (310 votes). Check out past polls.

Notices?