perlquestion
Cristoforo
<b>Update:</b> I believe the two programs below fail to count the steps correctly. In particular, the first simulation 'hops' 1 to 6 cells and counts that as a single step. That is probably wrong; it should instead count every cell covered in a hop. This revised program does that.
<p>
<c>
#!/usr/bin/perl
use strict;
use warnings;
use Storable;
use Statistics::Descriptive;
my @grid = @{ retrieve('grid.dat') };
my %param = %{ retrieve('param.dat') }; # rows and cols parameters
my @contaminated_walks;
for (1 .. 100) {
my $num_walks = 100;
my ($x, $y) = (int(rand $param{range_contx}), int(rand $param{range_conty}));
my @walks;
for (1 .. $num_walks) {
my $steps = 100;
my $infected;
my $total_steps;
#Inner loop to perform each step of a random walk
for (1 .. $steps) {
$total_steps += my $rand_steps = (1 + int( rand 6 ));
last if $total_steps > $steps;
my $random_num = rand;
if($random_num < 0.25) {
for (1 .. $rand_steps) {
$x = ($x - 1) % $param{range_contx};
$infected += $grid[$x][$y] || 0;
}
}
elsif ($random_num < 0.5) {
for (1 .. $rand_steps) {
$x = ($x + 1) % $param{range_contx};
$infected += $grid[$x][$y] || 0;
}
}
elsif ($random_num < 0.75) {
for (1 .. $rand_steps) {
$y = ($y - 1) % $param{range_conty};
$infected += $grid[$x][$y] || 0;
}
}
else {
for (1 .. $rand_steps) {
$y = ($y + 1) % $param{range_conty};
$infected += $grid[$x][$y] || 0;
}
}
}
push @walks, $infected;
}
push @contaminated_walks, scalar grep $_, @walks;
}
my $stat = Statistics::Descriptive::Sparse->new();
$stat->add_data(@contaminated_walks);
printf "min-max %d-%d mean: %.1f std. deviation: %.1f count: %d\n",
$stat->min, $stat->max, $stat->mean, $stat->standard_deviation, $stat->count;
__END__
C:\Old_Data\perlp>perl t33.pl
min-max 34-61 mean: 50.7 std. deviation: 5.6 count: 100
</c>
I would like someone to explain why I'm getting different results from two nearly identical simulation runs (I'll explain the difference below). The problem was posed on Perl Guru Forums here [http://perlguru.com/gforum.cgi?post=71868;sb=post_latest_reply;so=ASC;forum_view=forum_view_collapsed;guest=5787217|Creating a 100x100 grid in perl]. I am not a student working on this problem, just trying to solve it for myself. :-)
<p>
The intent of the simulation is to move randomly around a grid for a specified number of steps and, at the end, see whether you stepped on an infected cell (and so became infected).
The specification was to create a 100 x 100 grid, move around it randomly 1 to 6 cells at a time (up, down, left or right), and record any steps onto an infected cell (the specs had 100 infected cells out of 10,000). The specs also said to create an unchanging grid and run the different simulations against that identical grid. (My script doesn't use an unchanging grid, but I don't think that's the problem here.)
<p>Here is the grid creation script, <readmore><c>#!/usr/bin/perl
use strict;
use warnings;
use Storable;
my $number_contaminant = 100;
my %param = (range_contx => 100,
range_conty => 100 );
my @grid;
for (1 .. $number_contaminant) {
# random positions of the contaminants, truncated to integers
my $x = int(rand $param{range_contx});
my $y = int(rand $param{range_conty});
redo if $grid[$x][$y]; # if already marked
$grid[$x][$y] = 1;
}
store \@grid, 'grid.dat';
store \%param, 'param.dat';
</c></readmore> and the simulation run against the grid, <readmore><c>#!/usr/bin/perl
use strict;
use warnings;
use Storable;
use Statistics::Descriptive;
my @grid = @{ retrieve('grid.dat') };
my %param = %{ retrieve('param.dat') }; # range_contx and range_conty parameters
my @contaminated_walks;
for (1 .. 100) {
my $steps = 100;
my $num_walks = 100;
my ($x, $y) = (int(rand $param{range_contx}), int(rand $param{range_conty}));
my @walks;
for (1 .. $num_walks) {
my $infected;
#Inner loop to perform each step of a random walk
for (1 .. $steps) {
my $random_num = rand;
my $hop = (1 + int( rand 6 )); # cells moved in this hop (1 to 6)
if($random_num < 0.25) {
$x = ($x - $hop) % $param{range_contx};
}
elsif ($random_num < 0.5) {
$x = ($x + $hop) % $param{range_contx};
}
elsif ($random_num < 0.75) {
$y = ($y - $hop) % $param{range_conty};
}
else {
$y = ($y + $hop) % $param{range_conty};
}
}
$infected += $grid[$x][$y] || 0;
}
push @walks, $infected;
}
push @contaminated_walks, scalar grep $_, @walks;
}
my $stat = Statistics::Descriptive::Sparse->new();
$stat->add_data(@contaminated_walks);
printf "mean: %.f std. deviation: %.f count: %d\n",
$stat->mean, $stat->standard_deviation, $stat->count;
__END__
C:\Old_Data\perlp>perl his_stat.pl
mean: 60 std. deviation: 5 count: 100
</c></readmore>
<p>And here is a solution I wrote that moves, in effect, to any random cell (unconstrained by the 1 to 6 cell move in the other program). <readmore><c>#!/usr/bin/perl
use strict;
use warnings;
use List::Util qw/ shuffle /;
use Statistics::Descriptive;
my @contaminated_walks;
for (1 .. 100) {
my $contaminated = 100;
my $grid_rows = 100;
my $grid_cols = 100;
my $steps = 100;
my $num_walks = 100;
my @walks;
for (1 .. $num_walks) {
my @grid = # 1's and 0's randomly ordered
shuffle( (1) x $contaminated,
(0) x ($grid_rows * $grid_cols - $contaminated)
);
# for each walk, 'grep' gets the count of contaminated cells
push @walks, scalar grep $_, map $grid[rand @grid], 1 .. $steps;
}
push @contaminated_walks, scalar grep $_, @walks;
}
my $stat = Statistics::Descriptive::Sparse->new();
$stat->add_data(@contaminated_walks);
printf "mean: %.f std. deviation: %.f count: %d\n",
$stat->mean, $stat->standard_deviation, $stat->count;
__END__
C:\Old_Data\perlp>perl my_stat.pl
mean: 64 std. deviation: 5 count: 100
</c></readmore>
<p>My analysis started from the likelihood of stepping on an uninfected cell: with 100 contaminated cells out of 10,000 (100 x 100), that is 9900/10,000 (a probability of .99). Then, for 100 steps, I figured the likelihood of not becoming infected was .99 ** 100 (.99 to the 100th power).
<p>Thus the probability of being infected would be 1 - .99**100 (because, in my simulation script, each step is independent of the one preceding it).
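<p>As a quick check of that arithmetic, here is a minimal sketch that evaluates 1 - .99**100 under the same assumption that every step lands on an independently chosen cell. The sub name p_infected_independent and its parameters are made up for illustration; they do not appear in either script.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not from either script): chance of touching at least
# one contaminated cell when each step picks a cell independently.
sub p_infected_independent {
    my ($contaminated, $cells, $steps) = @_;
    my $p_clean_step = ($cells - $contaminated) / $cells;   # 9900/10000 = .99
    return 1 - $p_clean_step ** $steps;                     # ** binds tighter than -
}

printf "P(infected) = %.3f\n", p_infected_independent(100, 10_000, 100);
```

With 100 contaminated cells, 10,000 cells and 100 steps this comes out at roughly .63, in line with the .64 that the independent-cell simulation reports.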
<p>The calculation for his is slightly different because he must move at least 1 square and cannot randomly stay in the same cell, as my solution can (this takes 1 of the 9900 uninfected cells out of play for subsequent steps). I think the probability here would be .99 for the first cell to be uninfected and .9899 for each of the following cells visited, giving .99 * (.9899 ** 99). Subtracting that from 1 should give the probability of infection.
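<p>Taking the .99 and .9899 figures above at face value, a minimal sketch to evaluate 1 - .99 * (.9899 ** 99) might look like this. The sub name p_infected_no_stay is hypothetical, and the per-step probabilities are simply passed in as stated rather than derived from the grid.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not from either script): the first cell is clean with
# probability $p_first_clean, each later cell with $p_later_clean.
sub p_infected_no_stay {
    my ($p_first_clean, $p_later_clean, $steps) = @_;
    my $p_never_infected = $p_first_clean * $p_later_clean ** ($steps - 1);
    return 1 - $p_never_infected;
}

printf "P(infected) = %.3f\n", p_infected_no_stay(0.99, 0.9899, 100);
```

This evaluates to about .64, so the estimate by itself does not account for a simulated value as low as .60.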
<p>My script's outcome gave a .64 chance of being infected while his gave a .60 chance, and I don't know why. For practical purposes, with the given parameters, the two are the same, and I would have expected to get the same probability of infection for his run, but as seen, it is .04 less.
<p>So I wonder why this is happening. Is my calculation in error, or is my simulation code wrong?
I don't think either is the case, although someone else might point out an error.
<p>Update: I don't know why my readmore tags didn't work for the code portions.