http://www.perlmonks.org?node_id=1197344


in reply to Re^8: High Performance Game of Life
in thread High Performance Game of Life

You need to run the following little test to catch any errors with negative x and y values before firing off any benchmarks:

# tgol.t - Simple blinker test of Conway Game of Life Organism class use strict; use warnings; use Organism; use Test::More; my $nblinks = 5; my $ntests = ( $nblinks + 1 ) * 3; plan tests => $ntests; sub test_one { my $org = shift; # Organism handle my $desc = shift; # Test description my $expected = shift; # Array ref of (sorted) expected cells my $nexpected = @{$expected}; my $ncells = $org->count(); my @cells = $org->get_live_cells(); cmp_ok( $ncells, '==', $nexpected, "$desc cell count ($ncells)" ); cmp_ok( scalar(@cells), '==', $nexpected, "$desc cell array count" +); is_deeply( \@cells, $expected, "$desc cell array" ); } # Blinker pattern my @blinker1 = ( [ -101, -100 ], [ -100, -100 ], [ -99, -100 ], [ -101, 100 ], [ -100, 100 ], [ -99, 100 ], [ -1, 0 ], [ 0, 0 ], [ 1, 0 ], [ 99, -100 ], [ 100, -100 ], [ 101, -100 ], [ 99, 100 ], [ 100, 100 ], [ 101, 100 ], ); my @blinker2 = ( [ -100, -99 ], [ -100, -100 ], [ -100, -101 ], [ -100, 99 ], [ -100, 100 ], [ -100, 101 ], [ 0, -1 ], [ 0, 0 ], [ 0, 1 ], [ 100, -99 ], [ 100, -100 ], [ 100, -101 ], [ 100, 99 ], [ 100, 100 ], [ 100, 101 ], ); my @sblinker1 = sort { $a->[0] <=> $b->[0] || $a->[1] <=> $b->[1] } @b +linker1; my @sblinker2 = sort { $a->[0] <=> $b->[0] || $a->[1] <=> $b->[1] } @b +linker2; # Initialization my $org = Organism->new(); $org->insert_cells(@blinker1); test_one( $org, "initial", \@sblinker1 ); # Pattern should just blink back and forth for my $i ( 1 .. $nblinks ) { $org->tick(); test_one( $org, "blinker $i", $i % 2 ? \@sblinker2 : \@sblinker1 ); }
with the command line:
prove tgol.t
Unfortunately, your latest effort fails this test on my machine as shown below:
# Failed test 'initial cell count (11)' # at tgol.t line 17. # got: 11 # expected: 15 # Failed test 'initial cell array count' # at tgol.t line 18. # got: 11 # expected: 15 # Failed test 'initial cell array' # at tgol.t line 19. # Structures begin differing at: # $got->[0][0] = '-281474976710655' # $expected->[0][0] = '-101' # Failed test 'blinker 1 cell count (10)' # at tgol.t line 17. # got: 10 # expected: 15 # Failed test 'blinker 1 cell array count' # at tgol.t line 18. # got: 10 # expected: 15 # Failed test 'blinker 1 cell array' # at tgol.t line 19. # Structures begin differing at: # $got->[0][0] = '-281474976710655' # $expected->[0][0] = '-100' # Failed test 'blinker 2 cell count (11)' # at tgol.t line 17. # got: 11 # expected: 15 # Failed test 'blinker 2 cell array count' # at tgol.t line 18. # got: 11 # expected: 15 # Failed test 'blinker 2 cell array' # at tgol.t line 19. # Structures begin differing at: # $got->[0][0] = '-281474976710655' # $expected->[0][0] = '-101' # Failed test 'blinker 3 cell count (9)' # at tgol.t line 17. # got: 9 # expected: 15 # Failed test 'blinker 3 cell array count' # at tgol.t line 18. # got: 9 # expected: 15 # Failed test 'blinker 3 cell array' # at tgol.t line 19. # Structures begin differing at: # $got->[0][1] = '99' # $expected->[0][1] = '-101' # Failed test 'blinker 4 cell count (9)' # at tgol.t line 17. # got: 9 # expected: 15 # Failed test 'blinker 4 cell array count' # at tgol.t line 18. # got: 9 # expected: 15 # Failed test 'blinker 4 cell array' # at tgol.t line 19. # Structures begin differing at: # $got->[0][1] = '100' # $expected->[0][1] = '-100' # Failed test 'blinker 5 cell count (8)' # at tgol.t line 17. # got: 8 # expected: 15 # Failed test 'blinker 5 cell array count' # at tgol.t line 18. # got: 8 # expected: 15 # Failed test 'blinker 5 cell array' # at tgol.t line 19. # Structures begin differing at: # $got->[0][1] = '99' # $expected->[0][1] = '-101' # Looks like you failed 18 tests of 18.

I left this tgol.t test program out of the root node because it was already way too long ... then forgot about it. Sorry 'bout that. Update: I've now remedied my oversight by adding the tgol.t test program above to the root node.

Replies are listed 'Best First'.
Re^10: High Performance Game of Life
by marioroy (Prior) on Aug 13, 2017 at 22:07 UTC

    Hi eyepopslikeamosquito. Yes, I've been running tgol2.t found here and it's been running fine. I can comfirm that bit-manipulation is failing with the new tgot2.t. The initial test script did not test for negative $y. Thus, assumed that $y was always positive. The bit-manipulation code will no longer work.

    Okay, will comment readers to your post and strike out the bit-manipulation sections. Thank you for posting Extra Test Program tgol.t.

    Update: For closure, I tested mapping supporting negative $x and $y. Pack('i2') is faster unless running cperl.

    use strict; use warnings; use Time::HiRes qw(time); my ( $x , $y , $iters ) = ( -890394, 100, 5_000_000 ); my ( $xx, $yy, $n ); ## # sub _pack { # my ( $x, $y ) = @_; # return # $x < 0 ? -( abs($x) << 16 | $y ) : $x << 16 | $y; # } # # sub _unpack { # my ( $n ) = @_; # return $n < 0 # ? ( -( abs($n) >> 16 ), abs($n) & 0xFFFF ) # : ( $n >> 16 , $n & 0xFFFF ); # } ## bench( "bitops ", sub { # map two integers $x and $y into $n # support negative $x only for ( 1 .. $iters ) { $n = $x < 0 ? -( abs($x) << 16 | $y ) : $x << 16 | $y; ( $xx, $yy ) = $n < 0 ? ( -( abs($n) >> 16 ), abs($n) & 0xFFFF ) : ( $n >> 16 , $n & 0xFFFF ); } }); ## # sub _pack { # my ( $x, $y ) = @_; # # bits 0,1 indicate neg flag for $x,$y respectively # return # ( abs($x) << 18 ) + ( $x < 0 ? 1 : 0 ) + # ( abs($y) << 2 ) + ( $y < 0 ? 2 : 0 ); # } # # sub _unpack { # my ( $n ) = @_; # # bits 0,1 indicate neg flag for $x,$y respectively # return ( # $n & 0x1 ? -($n >> 18 ) : $n >> 18, # $n & 0x2 ? -($n >> 2 & 0xFFFF) : $n >> 2 & 0xFFFF # ); # } ## bench( "bitops neg ", sub { # map two integers $x and $y into $n # support negative $x and $y for ( 1 .. $iters ) { $n = ( abs($x) << 18 ) + ( $x < 0 ? 1 : 0 ) + ( abs($y) << 2 ) + ( $y < 0 ? 2 : 0 ); ( $xx, $yy ) = ( $n & 0x1 ? -($n >> 18 ) : $n >> 18, $n & 0x2 ? -($n >> 2 & 0xFFFF) : $n >> 2 & 0xFFFF ); } }); bench( "(un)pack ii", sub { for ( 1 .. $iters ) { $n = pack 'ii', $x, $y; ( $xx, $yy ) = unpack 'ii', $n; } }); bench( "(un)pack i2", sub { for ( 1 .. $iters ) { $n = pack 'i2', $x, $y; ( $xx, $yy ) = unpack 'i2', $n; } }); exit; sub bench { my ( $start, $desc, $fcn ) = ( scalar time, @_ ); $fcn->(); printf "duration $desc %0.03f\n", time - $start; }

    Regards, Mario

      Update: Replaced bit-OR with addition to have $x and $y line up for better readability.

      This post is a fun study, comparing pack 'i2' against the mapping of two integers into one via bit manipulation. A use case for doing this is wanting readable keys for storing into a database using one field. This laptop runs an i7 Haswell at 2.6 GHz. Unfortunately, I do not have anything slower to run on.

      bin/perl v5.26.0

      $ perl createblinker.pl 500000 -900000 100 >x.tmp 2>y.tmp $ /opt/perl-5.26.0/bin/perl -I. tbench1.pl x.tmp 2 # pack i2 cell count at start = 1500000 run benchmark for 2 ticks cell count at end = 1500000 time taken: 37 secs time taken: 58 secs <- 32-bit Windows VM $ /opt/perl-5.26.0/bin/perl -I. tbench1.pl x.tmp 2 # mapping cell count at start = 1500000 run benchmark for 2 ticks cell count at end = 1500000 time taken: 39 secs time taken: 64 secs <- 32-bit Windows VM, applied 32-bit tip below

      bin/cperl v5.24.3c

      $ /opt/cperl-5.24.3c/bin/cperl -I. tbench1.pl x.tmp 2 # pack i2 cell count at start = 1500000 run benchmark for 2 ticks cell count at end = 1500000 time taken: 37 secs $ /opt/cperl-5.24.3c/bin/cperl -I. tbench1.pl x.tmp 2 # mapping cell count at start = 1500000 run benchmark for 2 ticks cell count at end = 1500000 time taken: 38 secs

      I used tybalt89's update and applied the mapping logic. All tests pass, thanks to new test script by eyepopslikeamosquito. Please ensure Perl is compiled with 64-bit support. 16 bits hold the value for $y and 2 bits for whether $x,$y are less than 0. $x is stored in the remaining bits. This results in minimum key lenght as $y isn't big. Though, adjust accordingly the number of bits to shift and bitmask if necessary.

      bits 63-18 contains the $x value bits 17-2 contains the $y value bit 1 set when $y < 0 bit 0 set when $x < 0

      On 32-bit hardware, replace 18 and 0xFFFF with 10 and 0xFF throughout the module.

      bits 31-10 contains the $x value bits 9-2 contains the $y value bit 1 set when $y < 0 bit 0 set when $x < 0

      Both _unpack and _pack are inlined inside tick for maximum performance.

      package Organism; use strict; # use warnings; sub _pack { my ( $x, $y ) = @_; # bits 0,1 negative flag for $x,$y respectively return ( abs($x) << 18 ) + ( $x < 0 ? 1 : 0 ) + ( abs($y) << 2 ) + ( $y < 0 ? 2 : 0 ); } sub _unpack { my ( $n ) = @_; # bits 0,1 negative flag for $x,$y respectively return ( $n & 0x1 ? -($n >> 18 ) : $n >> 18, $n & 0x2 ? -($n >> 2 & 0xFFFF) : $n >> 2 & 0xFFFF ); } sub count { return scalar keys %{ shift->{Cells} }; } # Input a list of [ x, y ] coords sub insert_cells { my $cells = shift->{Cells}; for my $r (@_) { $cells->{ _pack @{$r} } = undef } } # Return sorted list of cells in the Organism. # Used for verification and testing the state of the organism. sub get_live_cells { sort { $a->[0] <=> $b->[0] || $a->[1] <=> $b->[1] } map { [ _unpack $_ ] } keys %{ shift->{Cells} }; } sub tick { my $self = shift; my $cells = $self->{Cells}; my ( $k1, $k2, $k3, $k4, $k5, $k6, $k7, $k8, $x0, $x1, $x2, $y0, $y1, $y2, %new_cells, %dead_cells ); for my $c (keys %{ $cells }) { # Get the (up to 8) dead cells surrounding the cell ( $x0, $y0 ) = ( $c & 0x1 ? -($c >> 18 ) : $c >> 18, $c & 0x2 ? -($c >> 2 & 0xFFFF) : $c >> 2 & 0xFFFF ); ( $x1, $x2, $y1, $y2 ) = ( $x0 - 1, $x0 + 1, $y0 - 1, $y0 + 1 ); $dead_cells{$_}++ for my @zcells = ( ( $k1 = ( abs($x1) << 18 ) + ( $x1 < 0 ? 1 : 0 ) + ( abs($y1) << 2 ) + ( $y1 < 0 ? 2 : 0 ) ) x !(exists $cells->{$k1}), ( $k2 = ( abs($x1) << 18 ) + ( $x1 < 0 ? 1 : 0 ) + ( abs($y0) << 2 ) + ( $y0 < 0 ? 2 : 0 ) ) x !(exists $cells->{$k2}), ( $k3 = ( abs($x1) << 18 ) + ( $x1 < 0 ? 1 : 0 ) + ( abs($y2) << 2 ) + ( $y2 < 0 ? 2 : 0 ) ) x !(exists $cells->{$k3}), ( $k4 = ( abs($x0) << 18 ) + ( $x0 < 0 ? 1 : 0 ) + ( abs($y1) << 2 ) + ( $y1 < 0 ? 2 : 0 ) ) x !(exists $cells->{$k4}), ( $k5 = ( abs($x0) << 18 ) + ( $x0 < 0 ? 1 : 0 ) + ( abs($y2) << 2 ) + ( $y2 < 0 ? 2 : 0 ) ) x !(exists $cells->{$k5}), ( $k6 = ( abs($x2) << 18 ) + ( $x2 < 0 ? 1 : 0 ) + ( abs($y1) << 2 ) + ( $y1 < 0 ? 2 : 0 ) ) x !(exists $cells->{$k6}), ( $k7 = ( abs($x2) << 18 ) + ( $x2 < 0 ? 1 : 0 ) + ( abs($y0) << 2 ) + ( $y0 < 0 ? 2 : 0 ) ) x !(exists $cells->{$k7}), ( $k8 = ( abs($x2) << 18 ) + ( $x2 < 0 ? 1 : 0 ) + ( abs($y2) << 2 ) + ( $y2 < 0 ? 2 : 0 ) ) x !(exists $cells->{$k8}) ); # Check the live cell # Note: next line equivalent to nlive == 2 || nlive == 3 @zcells == 5 || @zcells == 6 and $new_cells{$c} = undef; } $dead_cells{$_} == 3 and $new_cells{$_} = undef for keys %dead_cell +s; $self->{Cells} = \%new_cells; } sub new { my $class = shift; my %init_self = ( Cells => {} ); bless \%init_self, $class; } 1;

      Pack is faster, of course. However, mapping two integers into one is not far behind.

      Regards, Mario

        I don't have access to a 64-bit perl. However, this is a 32-bit version of combining two 16 bit integers into a 32 bit number. While this runs as long as the coordinates are within range, I can't test it in the 64-bit version.

        I like it because all offsets are found with a simple addition inside tick.
        All the encode/decode mess is in the input and output routines.

        In theory (completely untested), all that's needed is to change the line

        my $half = 16; # make 32 for 64-bit perls

        to

        my $half = 32; # make 32 for 64-bit perls

        to make it use the full range of two 32 bit numbers.

        So here's the code

        package Organism; use strict; use warnings; sub count { return scalar keys %{ shift->{Cells} }; } # Input a list of [ x, y ] coords sub insert_cells { my $self = shift; my $cells = $self->{Cells}; for my $r (@_) { $cells->{ (($r->[0] + $self->{fudge}) << $self->{half}) | ($r->[1] + $self->{fudge}) } = undef } } # Return sorted list of cells in the Organism. # Used for verification and testing the state of the organism. sub get_live_cells { my $self = shift; sort { $a->[0] <=> $b->[0] || $a->[1] <=> $b->[1] } map { [ ($_ >> $self->{half}) - $self->{fudge}, ($_ & (1 << $self->{half}) - 1) - $self->{fudge} ] } keys %{ $self->{Cells} }; } sub tick { my $self = shift; my $cells = $self->{Cells}; my @deltas = @{ $self->{deltas} }; my ( %new_cells, %dead_cells ); for my $c (keys %{ $cells }) { # Get the (up to 8) dead cells surrounding the cell $dead_cells{$_}++ for my @zcells = grep !exists $cells->{$_}, map $c + $_, @deltas; # Check the live cell # Note: next line equivalent to nlive == 2 || nlive == 3 @zcells == 5 || @zcells == 6 and $new_cells{$c} = undef; } $dead_cells{$_} == 3 and $new_cells{$_} = undef for keys %dead_cell +s; $self->{Cells} = \%new_cells; } sub new { my $class = shift; my $half = 16; # make 32 for 64-bit perls my $base = 1 << $half; my $fudge = $base >> 1; my @deltas = ($base-1, $base, $base+1, -1, 1, -$base-1, -$base, -$base+1); my %init_self = ( Cells => {}, fudge => $fudge, half => $half, deltas => \@deltas ); bless \%init_self, $class; } 1;

        Preliminary testing with 16 bit numbers seemed to show it's about 10% slower than the pack version :(