Cristoforo has asked for the wisdom of the Perl Monks concerning the following question:
I would like to see if someone might explain why I'm getting different results from nearly identical, (I'll explain the difference below), simulation runs. The problem was posed on Perl Guru Forums here Creating a 100x100 grid in perl. I am not a student for this problem - just trying to solve the problem for myself. :-)#!/usr/bin/perl use strict; use warnings; use Storable; use Statistics::Descriptive; my @grid = @{ retrieve('grid.dat') }; my %param = %{ retrieve('param.dat') }; # rows and cols parameters my @contaminated_walks; for (1 .. 100) { my $num_walks = 100; my ($x, $y) = (int(rand $param{range_contx}), int(rand $param{rang +e_conty})); my @walks; for (1 .. $num_walks) { my $steps = 100; my $infected; my $total_steps; #Inner loop to perform each step of a random walk for (1 .. $steps) { $total_steps += my $rand_steps = (1 + int( rand 6 )); last if $total_steps > $steps; my $random_num = rand; if($random_num < 0.25) { for (1 .. $rand_steps) { $x = ($x - 1) % $param{range_contx}; $infected += $grid[$x][$y] || 0; } } elsif ($random_num < 0.5) { for (1 .. $rand_steps) { $x = ($x + 1) % $param{range_contx}; $infected += $grid[$x][$y] || 0; } } elsif ($random_num < 0.75) { for (1 .. $rand_steps) { $y = ($y - 1) % $param{range_conty}; $infected += $grid[$x][$y] || 0; } } else { for (1 .. $rand_steps) { $y = ($y + 1) % $param{range_conty}; $infected += $grid[$x][$y] || 0; } } } push @walks, $infected; } push @contaminated_walks, scalar grep $_, @walks; } my $stat = Statistics::Descriptive::Sparse->new(); $stat->add_data(@contaminated_walks); printf "min-max %d-%d mean: %.1f std. deviation: %.1f count: %d\n", $stat->min, $stat->max, $stat->mean, $stat->standard_deviation, $s +tat->count; __END__ C:\Old_Data\perlp>perl t33.pl min-max 34-61 mean: 50.7 std. deviation: 5.6 count: 100
The intent of the simulation is to randomly move around a grid a specified number of steps and at the end, see if you stepped upon an infected cell, (and then become infected). The specification was to create a 100 x 100 grid and moving randomly 1 to 6 cells, (up, down, left or right), move around the grid and record any steps upon an infected cell. (the specs had 100 infected out of 10,000) The specs also said to create an unchanging grid and run different simulations with an identical grid. (My script doesn't use an unchanging grid, but I don't think thats the problem here).
Here are the grid creation script,
#!/usr/bin/perl use strict; use warnings; use Storable; my $number_contaminant = 100; my %param = (range_contx => 100, range_conty => 100 ); my @grid; for (1 .. $number_contaminant) { #random positions of the contaminants, put the random number as in +tegrer my $x = int(rand $param{range_contx}); my $y = int(rand $param{range_conty}); redo if $grid[$x][$y]; # if already marked $grid[$x][$y] = 1; } store \@grid, 'grid.dat'; store \%param, 'param.dat';
#!/usr/bin/perl use strict; use warnings; use Storable; use Statistics::Descriptive; my @grid = @{ retrieve('grid.dat') }; my %param = %{ retrieve('param.dat') }; # range_contx and range_conty +parameters my @contaminated_walks; for (1 .. 100) { my $steps = 100; my $num_walks = 100; my ($x, $y) = (int(rand $param{range_contx}), int(rand $param{rang +e_conty})); my @walks; for (1 .. $num_walks) { my $infected; #Inner loop to perform each step of a random walk for (1 .. $steps) { my $random_num = rand; my $steps = (1 + int( rand 6 )); if($random_num < 0.25) { $x = ($x - $steps) % $param{range_contx}; } elsif ($random_num < 0.5) { $x = ($x + $steps) % $param{range_contx}; } elsif ($random_num < 0.75) { $y = ($y - $steps) % $param{range_conty}; } else { $y = ($y + $steps) % $param{range_conty}; } $infected += $grid[$x][$y] || 0; } push @walks, $infected; } push @contaminated_walks, scalar grep $_, @walks; } my $stat = Statistics::Descriptive::Sparse->new(); $stat->add_data(@contaminated_walks); printf "mean: %.f std. deviation: %.f count: %d\n", $stat->mean, $stat->standard_deviation, $stat->count; __END__ C:\Old_Data\perlp>perl his_stat.pl mean: 60 std. deviation: 5 count: 100
And, here is a solution I made that moves, in effect, to any random cell, (unrestrained by the 1 - 6 move in the other program).
#!/usr/bin/perl use strict; use warnings; use List::Util qw/ shuffle /; use Statistics::Descriptive; my @contaminated_walks; for (1 .. 100) { my $contaminated = 100; my $grid_rows = 100; my $grid_cols = 100; my $steps = 100; my $num_walks = 100; my @walks; for (1 .. $num_walks) { my @grid = # 1's and 0's randomly ordered shuffle( (1) x $contaminated, (0) x ($grid_rows * $grid_cols - $contaminated) ); # for each walk, 'grep' gets the count of contaminated cells push @walks, scalar grep $_, map $grid[rand @grid], 1 .. $step +s; } push @contaminated_walks, scalar grep $_, @walks; } my $stat = Statistics::Descriptive::Sparse->new(); $stat->add_data(@contaminated_walks); printf "mean: %.f std. deviation: %.f count: %d\n", $stat->mean, $stat->standard_deviation, $stat->count; __END__ C:\Old_Data\perlp>perl my_stat.pl mean: 64 std. deviation: 5 count: 100
My analysis was based on the likelyhood of stepping upon an uninfected cell (when there were 100 contaminated out of 10,000, (100 x 100)), was 9900/10,000 (or .99 probabilty). Then, for 100 runs, figured the likelyhood of not becoming infected was .99 ** 100, (.99 to the 100th power).
Thus the possibilty of being infected would be 1 - .99**100. (Because, in my simulation script, each step is independent of the one preceding it).
The method to calculate his is slightly different because, he must move at least 1 square and cannot randomly stay in the same cell, (taking 1 of the 9900 uninfected cells out of play for subsequent steps), as my solution can. I think then to get the probability here would be .99 for the first cell to be uninfected and the following cells visited to be .9899, .99 * (.9899 ** 99). Subtracting that from 1 should give the probabilty of infection.
My script's outcome gave .64 chance of being infected while his gave .60 chance. And I don't know why, they are, for practical purposes with the given parameters the same and I would have expected to get the same probabilty of infection for his run, but as seen, it is .04 less.
I guess I wonder why this is happening. Is my calculation in error or is my simulation code wrong? I don't think either is the case, although someone else might point out an error.
Update: I don't know why my readmore tags didn't work for the code portions.
|
---|
Replies are listed 'Best First'. | |
---|---|
Re: Running a simulation - expected outcome
by BrowserUk (Patriarch) on Sep 29, 2012 at 20:20 UTC | |
by BillKSmith (Monsignor) on Sep 30, 2012 at 04:00 UTC | |
by BrowserUk (Patriarch) on Sep 30, 2012 at 05:35 UTC | |
by hippo (Bishop) on Oct 01, 2012 at 16:46 UTC | |
by BrowserUk (Patriarch) on Oct 01, 2012 at 18:41 UTC | |
by Cristoforo (Curate) on Oct 01, 2012 at 20:59 UTC | |
by BrowserUk (Patriarch) on Oct 01, 2012 at 21:28 UTC | |
Re: Running a simulation - expected outcome
by BrowserUk (Patriarch) on Sep 29, 2012 at 19:54 UTC | |
by Cristoforo (Curate) on Sep 29, 2012 at 20:13 UTC |