Beefy Boxes and Bandwidth Generously Provided by pair Networks
go ahead... be a heretic
 
PerlMonks  

Re^17: High Performance Game of Life (updated - results)

by eyepopslikeamosquito (Archbishop)
on Aug 15, 2017 at 22:38 UTC ( [id://1197471]=note: print w/replies, xml ) Need Help??


in reply to Re^16: High Performance Game of Life (updated - results)
in thread High Performance Game of Life

Re having an old PC - and rubbing salt into tybalt89's wounds - please note that my (Haswell 4770K-powered) home PC is unchanged from when I mentioned it over three years ago in The 10**21 Problem (Part 2) and so is getting rather old and creaky now. Definitely due for an upgrade. :)

Though my ancient PC does have 32 GB of memory (and so there is no swapping), given my very slow benchmark results with Game::Life::Infinite::Board, I'm suspicious that my doddering, ancient memory is starting to fail, due to being so old. This may be contributing to the significantly different results I am seeing.

When running:

perl tbench1.pl x.tmp 2
with my originally shortened Organism.pm:
package Organism; use strict; # use warnings; sub count { return scalar keys %{ shift->{Cells} }; } # Input a list of [ x, y ] coords sub insert_cells { my $cells = shift->{Cells}; for my $r (@_) { $cells->{ pack 'i2', @{$r} } = undef } } # Return sorted list of cells in the Organism. # Used for verification and testing the state of the organism. sub get_live_cells { sort { $a->[0] <=> $b->[0] || $a->[1] <=> $b->[1] } map { [ unpack 'i2', $_ ] } keys %{ shift->{Cells} }; } sub tick { my $self = shift; my $cells = $self->{Cells}; my ( $k1, $k2, $k3, $k4, $k5, $k6, $k7, $k8, $x0, $x1, $x2, $y0, $y1, $y2, %new_cells ); for my $c (keys %{ $cells }) { # Get the (up to 8) dead cells surrounding the cell ( $x0, $y0 ) = unpack 'i2', $c; ( $x1, $x2, $y1, $y2 ) = ( $x0 - 1, $x0 + 1, $y0 - 1, $y0 + 1 ); my @zcells = ( ($k1 = pack 'i2', $x1, $y1) x !(exists $cells->{$k1}), ($k2 = pack 'i2', $x1, $y0) x !(exists $cells->{$k2}), ($k3 = pack 'i2', $x1, $y2) x !(exists $cells->{$k3}), ($k4 = pack 'i2', $x0, $y1) x !(exists $cells->{$k4}), ($k5 = pack 'i2', $x0, $y2) x !(exists $cells->{$k5}), ($k6 = pack 'i2', $x2, $y1) x !(exists $cells->{$k6}), ($k7 = pack 'i2', $x2, $y0) x !(exists $cells->{$k7}), ($k8 = pack 'i2', $x2, $y2) x !(exists $cells->{$k8}) ); # Check the live cell # Note: next line equivalent to nlive == 2 || nlive == 3 @zcells == 5 || @zcells == 6 and $new_cells{$c} = undef; # Check the dead cells for my $z (@zcells) { ( $x0, $y0 ) = unpack 'i2', $z; ( $x1, $x2, $y1, $y2 ) = ( $x0 - 1, $x0 + 1, $y0 - 1, $y0 + 1 + ); # Get num live ( ( exists $cells->{ pack 'i2', $x1, $y1 } ) + ( exists $cells->{ pack 'i2', $x1, $y0 } ) + ( exists $cells->{ pack 'i2', $x1, $y2 } ) + ( exists $cells->{ pack 'i2', $x0, $y1 } ) + ( exists $cells->{ pack 'i2', $x0, $y2 } ) + ( exists $cells->{ pack 'i2', $x2, $y1 } ) + ( exists $cells->{ pack 'i2', $x2, $y0 } ) + ( exists $cells->{ pack 'i2', $x2, $y2 } ) ) == 3 and $new_cells{$z} = undef; } } $self->{Cells} = \%new_cells; } sub new { my $class = shift; my %init_self = ( Cells => {} ); bless \%init_self, $class; } 1;
I get 62 seconds, while after making the excellent improvements suggested by tybalt89 I get 80 seconds.
package Organism; use strict; # use warnings; sub count { return scalar keys %{ shift->{Cells} }; } # Input a list of [ x, y ] coords sub insert_cells { my $cells = shift->{Cells}; for my $r (@_) { $cells->{ pack 'i2', @{$r} } = undef } } # Return sorted list of cells in the Organism. # Used for verification and testing the state of the organism. sub get_live_cells { sort { $a->[0] <=> $b->[0] || $a->[1] <=> $b->[1] } map { [ unpack 'i2', $_ ] } keys %{ shift->{Cells} }; } sub tick { my $self = shift; my $cells = $self->{Cells}; my ( $k1, $k2, $k3, $k4, $k5, $k6, $k7, $k8, $x0, $x1, $x2, $y0, $y1, $y2, %new_cells, %dead_cells ); for my $c (keys %{ $cells }) { # Get the (up to 8) dead cells surrounding the cell ( $x0, $y0 ) = unpack 'i2', $c; ( $x1, $x2, $y1, $y2 ) = ( $x0 - 1, $x0 + 1, $y0 - 1, $y0 + 1 ); $dead_cells{$_}++ for my @zcells = ( ($k1 = pack 'i2', $x1, $y1) x !(exists $cells->{$k1}), ($k2 = pack 'i2', $x1, $y0) x !(exists $cells->{$k2}), ($k3 = pack 'i2', $x1, $y2) x !(exists $cells->{$k3}), ($k4 = pack 'i2', $x0, $y1) x !(exists $cells->{$k4}), ($k5 = pack 'i2', $x0, $y2) x !(exists $cells->{$k5}), ($k6 = pack 'i2', $x2, $y1) x !(exists $cells->{$k6}), ($k7 = pack 'i2', $x2, $y0) x !(exists $cells->{$k7}), ($k8 = pack 'i2', $x2, $y2) x !(exists $cells->{$k8}) ); # Check the live cell # Note: next line equivalent to nlive == 2 || nlive == 3 @zcells == 5 || @zcells == 6 and $new_cells{$c} = undef; } $dead_cells{$_} == 3 and $new_cells{$_} = undef for keys %dead_cell +s; $self->{Cells} = \%new_cells; } sub new { my $class = shift; my %init_self = ( Cells => {} ); bless \%init_self, $class; } 1;

Update: These tests were run with perl v5.24.0:

This is perl 5, version 24, subversion 0 (v5.24.0) built for MSWin32-x +64-multi-thread
Need to upgrade that ancient Perl. :)

Update: After upgrading to perl v5.26.0, both are faster: (62 secs, 80 secs) improved to (53 secs, 71 secs).

Replies are listed 'Best First'.
Re^18: High Performance Game of Life (updated - results)
by marioroy (Prior) on Aug 16, 2017 at 02:42 UTC

    Hi eyepopslikeamosquito,

    Oh I wished I had something older to run on. Unfortunately, my laptop is the slowest machine I have. The Haswell/Crystalwell chip has 128MB of eDRAM. It's purpose is for the GPU. But, the 4 cores has access to it as well when the GPU is idle, which is the case while benchmarking Perl. The late 2013 i7-4960HQ CPU runs at 2.6 GHz with turbo-boost at 3.8 GHz. I've forgotten that it could run that high on one core with nothing else running.

    I have some news to share. It's possible for tybalt89's excellent optimization to run faster. The extra optimization shaves 6 ~ 7 seconds. Memory consumption reduced as well. I applied the same optimization to the two bit implementations, mine and tybalt89's shorter implementation. For the latter, also applied a change mentioned earlier to not fail with cperl.

    # use 30-bits on 64-bit hw for cperl compatibility my $half = ( ( log(~0 + 1) / log(2) ) >= 64 ) ? 30 : 16;

    Before:

    $ perl createblinker.pl 500000 -900000 100 >x.tmp 2>y.tmp $ perl tbench1.pl x.tmp 2 mem size v5.26.0 v5.24.2 v5.22.4 cperl infinite 7,548 MB 122 secs 119 secs 133 secs 116 secs original 704 MB 167 secs 168 secs 180 secs 163 secs improvements 744 MB 57 secs 57 secs 63 secs 54 secs optimized i2 1,543 MB 38 secs 39 secs 40 secs 36 secs 2 nums into 1 1,510 MB 39 secs 40 secs 42 secs 37 secs shorter impl 1,661 MB 40 secs 42 secs 44 secs 37 secs

    After:

    mem size v5.26.0 v5.24.2 v5.22.4 cperl optimized i2 1,326 MB 31 secs 32 secs 33 secs 30 secs 2 nums into 1 1,292 MB 33 secs 34 secs 35 secs 31 secs shorter impl 1,443 MB 35 secs 36 secs 37 secs 31 secs

    The optimization was made to "Check dead cells".

    package Organism; use strict; # use warnings; sub count { return scalar keys %{ shift->{Cells} }; } # Input a list of [ x, y ] coords sub insert_cells { my $cells = shift->{Cells}; for my $r (@_) { $cells->{ pack 'i2', @{$r} } = undef } } # Return sorted list of cells in the Organism. # Used for verification and testing the state of the organism. sub get_live_cells { sort { $a->[0] <=> $b->[0] || $a->[1] <=> $b->[1] } map { [ unpack 'i2', $_ ] } keys %{ shift->{Cells} }; } sub tick { my $self = shift; my $cells = $self->{Cells}; my ( $k1, $k2, $k3, $k4, $k5, $k6, $k7, $k8, $x0, $x1, $x2, $y0, $y1, $y2, %new_cells, %dead_cells ); for my $c (keys %{ $cells }) { # Get the (up to 8) dead cells surrounding the cell ( $x0, $y0 ) = unpack 'i2', $c; ( $x1, $x2, $y1, $y2 ) = ( $x0 - 1, $x0 + 1, $y0 - 1, $y0 + 1 ); my @zcells = ( ($k1 = pack 'i2', $x1, $y1) x !(exists $cells->{$k1}), ($k2 = pack 'i2', $x1, $y0) x !(exists $cells->{$k2}), ($k3 = pack 'i2', $x1, $y2) x !(exists $cells->{$k3}), ($k4 = pack 'i2', $x0, $y1) x !(exists $cells->{$k4}), ($k5 = pack 'i2', $x0, $y2) x !(exists $cells->{$k5}), ($k6 = pack 'i2', $x2, $y1) x !(exists $cells->{$k6}), ($k7 = pack 'i2', $x2, $y0) x !(exists $cells->{$k7}), ($k8 = pack 'i2', $x2, $y2) x !(exists $cells->{$k8}) ); # Check the live cell # Note: next line equivalent to nlive == 2 || nlive == 3 @zcells == 5 || @zcells == 6 and $new_cells{$c} = undef; # Check the dead cells for my $z ( @zcells ) { $new_cells{$z} = undef if ++$dead_cells{$z} == 3; } } $self->{Cells} = \%new_cells; } sub new { my $class = shift; my %init_self = ( Cells => {} ); bless \%init_self, $class; } 1;

    Regards, Mario

      I have some news to share. It's possible for tybalt89's excellent optimization to run faster. ... The optimization was made to "Check dead cells".

      I don't believe your optimization of tybalt89's original code is correct. At least, it does not pass my more rigorous tgol3.t test program, shown below.

      Please test your solutions for correctness against this more thorough test (in addition to the original simple tgol.t blinker test).

      # tgol3.t - Simple lidka test of Conway Game of Life Organism class use strict; use warnings; use Organism; use Test::More; my $nticks = 100; my $ntests = 3 + ( $nticks + 1) * 2; plan tests => $ntests; sub test_one { my $org = shift; # Organism handle my $desc = shift; # Test description my $nexpected = shift; # Expected cell count # my $expected = shift; # Array ref of (sorted) expected cells my @cells = $org->get_live_cells(); my $ncells = $org->count(); cmp_ok( $ncells, '==', $nexpected, "$desc cell count ($ncells)" ); cmp_ok( scalar(@cells), '==', $nexpected, "$desc cell array count" +); # is_deeply( \@cells, $expected, "$desc cell array" ); } # Test first 100 ticks from famous lidka methuselah # See: http://conwaylife.com/wiki/Lidka my @lidka_ticks = ( 13, # [ 0] initial count 15, # [ 1] 15, # [ 2] 19, # [ 3] 19, # [ 4] 23, # [ 5] 23, # [ 6] 32, # [ 7] 29, # [ 8] 47, # [ 9] 27, # [10] 32, # [11] 36, # [12] 42, # [13] 48, # [14] 48, # [15] 46, # [16] 60, # [17] 54, # [18] 56, # [19] 64, # [20] 86, # [21] 64, # [22] 74, # [23] 70, # [24] 68, # [25] 52, # [26] 58, # [27] 50, # [28] 44, # [29] 50, # [30] 54, # [31] 80, # [32] 50, # [33] 54, # [34] 56, # [35] 54, # [36] 62, # [37] 50, # [38] 58, # [39] 56, # [40] 70, # [41] 60, # [42] 48, # [43] 52, # [44] 56, # [45] 72, # [46] 70, # [47] 68, # [48] 78, # [49] 86, # [50] 82, # [51] 93, # [52] 98, # [53] 94, # [54] 110, # [55] 87, # [56] 95, # [57] 79, # [58] 88, # [59] 80, # [60] 69, # [61] 76, # [62] 91, # [63] 89, # [64] 93, # [65] 112, # [66] 108, # [67] 140, # [68] 129, # [69] 157, # [70] 138, # [71] 147, # [72] 129, # [73] 111, # [74] 101, # [75] 105, # [76] 98, # [77] 117, # [78] 106, # [79] 114, # [80] 131, # [81] 124, # [82] 132, # [83] 118, # [84] 128, # [85] 133, # [86] 128, # [87] 140, # [88] 129, # [89] 126, # [90] 140, # [91] 147, # [92] 168, # [93] 163, # [94] 174, # [95] 164, # [96] 170, # [97] 152, # [98] 150, # [99] 144, # [100] ); # Lidka cells after 100 ticks my @lidka100 = ( [ -29, 2 ], # 1-10 [ -28, 1 ], [ -28, 2 ], [ -28, 3 ], [ -27, 0 ], [ -27, 4 ], [ -26, 0 ], [ -26, 1 ], [ -26, 4 ], [ -26, 5 ], [ -25, 3 ], # 10-20 [ -25, 4 ], [ -24, 2 ], [ -24, 3 ], [ -23, 1 ], [ -23, 2 ], [ -22, 0 ], [ -21, 0 ], [ -21, 1 ], [ -17, -2 ], [ -17, -1 ], # 20-30 [ -16, -2 ], [ -16, -1 ], [ -12, 8 ], [ -11, 9 ], [ -11, 10 ], [ -11, 16 ], [ -11, 17 ], [ -10, 8 ], [ -10, 9 ], [ -10, 10 ], # 30-40 [ -10, 15 ], [ -10, 16 ], [ -10, 18 ], [ -10, 19 ], [ -9, 18 ], [ -9, 19 ], [ -8, 13 ], [ -8, 18 ], [ -8, 19 ], [ -7, 12 ], # 40-50 [ -6, -19 ], [ -6, -18 ], [ -6, 16 ], [ -6, 17 ], [ -5, -19 ], [ -5, -17 ], [ -5, 12 ], [ -5, 16 ], [ -5, 17 ], [ -4, -19 ], # 50-60 [ -4, 13 ], [ -4, 14 ], [ -4, 15 ], [ -4, 16 ], [ -4, 18 ], [ -3, 14 ], [ -3, 15 ], [ -3, 18 ], [ -2, 17 ], [ -2, 18 ], # 60-70 [ -1, -13 ], [ -1, 6 ], [ 0, -14 ], [ 0, -13 ], [ 0, -12 ], [ 0, 5 ], [ 0, 6 ], [ 0, 7 ], [ 1, -15 ], [ 1, -14 ], # 70-80 [ 1, -12 ], [ 1, 6 ], [ 1, 9 ], [ 1, 10 ], [ 2, -22 ], [ 2, -15 ], [ 2, 3 ], [ 2, 6 ], [ 3, -22 ], [ 3, -16 ], # 80-90 [ 3, 2 ], [ 3, 3 ], [ 3, 4 ], [ 3, 5 ], [ 3, 9 ], [ 4, -22 ], [ 4, -17 ], [ 4, -16 ], [ 4, -15 ], [ 4, -12 ], # 90-100 [ 4, -11 ], [ 4, 3 ], [ 4, 8 ], [ 5, -16 ], [ 5, -14 ], [ 5, -12 ], [ 5, 7 ], [ 6, -16 ], [ 6, -15 ], [ 6, -14 ], # 100-110 [ 6, 4 ], [ 6, 6 ], [ 7, -15 ], [ 7, -14 ], [ 7, 4 ], [ 7, 12 ], [ 7, 13 ], [ 8, -14 ], [ 8, 12 ], [ 8, 13 ], # 110-120 [ 9, -14 ], [ 9, -13 ], [ 9, -4 ], [ 9, -2 ], [ 9, 10 ], [ 9, 11 ], [ 10, -5 ], [ 10, -1 ], [ 10, 10 ], [ 10, 11 ], # 120-130 [ 11, -1 ], [ 11, 0 ], [ 12, -7 ], [ 12, -1 ], [ 12, 0 ], [ 13, -8 ], [ 13, -7 ], [ 13, -3 ], [ 13, -2 ], [ 13, -1 ], # 130-140 [ 14, -8 ], [ 14, -3 ], [ 14, -2 ], [ 14, 1 ], [ 15, -7 ], [ 15, -6 ], [ 15, -5 ], [ 15, -1 ], [ 15, 0 ], [ 15, 1 ], # 140-144 [ 16, -7 ], [ 16, -6 ], [ 16, -5 ], ); my @slidka100 = sort { $a->[0] <=> $b->[0] || $a->[1] <=> $b->[1] } @l +idka100; # Lidka starting pattern my @lidka0 = ( [ -3, -7 ], [ -4, -6 ], [ -2, -6 ], [ -3, -5 ], [ 4, 3 ], [ 2, 4 ], [ 4, 4 ], [ 1, 5 ], [ 2, 5 ], [ 4, 5 ], [ 0, 7 ], [ 1, 7 ], [ 2, 7 ], ); my @slidka0 = sort { $a->[0] <=> $b->[0] || $a->[1] <=> $b->[1] } @lid +ka0; # Initial cell array my $org = Organism->new(); $org->insert_cells(@lidka0); { my @cells = $org->get_live_cells(); is_deeply( \@cells, \@slidka0, "lidka initial cell array" ); test_one( $org, "lidka 0", $lidka_ticks[0] ); } # Test first 100 ticks for my $i ( 1 .. $nticks ) { $org->tick(); test_one( $org, "lidka $i", $lidka_ticks[$i] ); } # Final cell array { my @cells = $org->get_live_cells(); cmp_ok( scalar(@cells), '==', $lidka_ticks[100], "lidka final array + count" ); is_deeply( \@cells, \@slidka100, "lidka final cell array" ); }

        Thank you, eyepopslikeamosquito. I reverted the failing optimization.

        To be sure, I ran the improved test script against the following. All passing, including the two involving bit manipulations.

        $ perl createblinker.pl 500000 -900000 100 >x.tmp 2>y.tmp $ perl tbench1.pl x.tmp 2 mem size v5.26.0 v5.24.2 v5.22.4 cperl infinite 7,548 MB 122 secs 119 secs 133 secs 116 secs original 704 MB 167 secs 168 secs 180 secs 163 secs improvements 744 MB 57 secs 57 secs 63 secs 54 secs optimized i2 1,543 MB 38 secs 39 secs 40 secs 36 secs 2 nums into 1 1,510 MB 39 secs 40 secs 42 secs 37 secs shorter impl 1,661 MB 40 secs 42 secs 44 secs 37 secs

        It's been an interesting ride and learned a lot of things about Perl.

        Regards, Mario

        On first glance, that seems to allow new cells to be generated when 3 or more neighbors are alive, instead of exactly 3.

Log In?
Username:
Password:

What's my password?
Create A New User
Domain Nodelet?
Node Status?
node history
Node Type: note [id://1197471]
help
Chatterbox?
and the web crawler heard nothing...

How do I use this?Last hourOther CB clients
Other Users?
Others examining the Monastery: (6)
As of 2024-09-17 15:33 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?
    The PerlMonks site front end has:





    Results (22 votes). Check out past polls.

    Notices?
    erzuuli‥ 🛈The London Perl and Raku Workshop takes place on 26th Oct 2024. If your company depends on Perl, please consider sponsoring and/or attending.