in reply to Re^10: High Performance Game of Life
in thread High Performance Game of Life
Update: Replaced bitOR with addition to have $x and $y line up for better readability.
This post is a fun study, comparing pack 'i2' against the mapping of two integers into one via bit manipulation. A use case for doing this is wanting readable keys for storing into a database using one field. This laptop runs an i7 Haswell at 2.6 GHz. Unfortunately, I do not have anything slower to run on.
bin/perl v5.26.0
$ perl createblinker.pl 500000 900000 100 >x.tmp 2>y.tmp $ /opt/perl5.26.0/bin/perl I. tbench1.pl x.tmp 2 # pack i2 cell count at start = 1500000 run benchmark for 2 ticks cell count at end = 1500000 time taken: 37 secs time taken: 58 secs < 32bit Windows VM $ /opt/perl5.26.0/bin/perl I. tbench1.pl x.tmp 2 # mapping cell count at start = 1500000 run benchmark for 2 ticks cell count at end = 1500000 time taken: 39 secs time taken: 64 secs < 32bit Windows VM, applied 32bit tip below
bin/cperl v5.24.3c
$ /opt/cperl5.24.3c/bin/cperl I. tbench1.pl x.tmp 2 # pack i2 cell count at start = 1500000 run benchmark for 2 ticks cell count at end = 1500000 time taken: 37 secs $ /opt/cperl5.24.3c/bin/cperl I. tbench1.pl x.tmp 2 # mapping cell count at start = 1500000 run benchmark for 2 ticks cell count at end = 1500000 time taken: 38 secs
I used tybalt89's update and applied the mapping logic. All tests pass, thanks to new test script by eyepopslikeamosquito. Please ensure Perl is compiled with 64bit support. 16 bits hold the value for $y and 2 bits for whether $x,$y are less than 0. $x is stored in the remaining bits. This results in minimum key lenght as $y isn't big. Though, adjust accordingly the number of bits to shift and bitmask if necessary.
bits 6318 contains the $x value bits 172 contains the $y value bit 1 set when $y < 0 bit 0 set when $x < 0
On 32bit hardware, replace 18 and 0xFFFF with 10 and 0xFF throughout the module.
bits 3110 contains the $x value bits 92 contains the $y value bit 1 set when $y < 0 bit 0 set when $x < 0
Both _unpack and _pack are inlined inside tick for maximum performance.
package Organism; use strict; # use warnings; sub _pack { my ( $x, $y ) = @_; # bits 0,1 negative flag for $x,$y respectively return ( abs($x) << 18 ) + ( $x < 0 ? 1 : 0 ) + ( abs($y) << 2 ) + ( $y < 0 ? 2 : 0 ); } sub _unpack { my ( $n ) = @_; # bits 0,1 negative flag for $x,$y respectively return ( $n & 0x1 ? ($n >> 18 ) : $n >> 18, $n & 0x2 ? ($n >> 2 & 0xFFFF) : $n >> 2 & 0xFFFF ); } sub count { return scalar keys %{ shift>{Cells} }; } # Input a list of [ x, y ] coords sub insert_cells { my $cells = shift>{Cells}; for my $r (@_) { $cells>{ _pack @{$r} } = undef } } # Return sorted list of cells in the Organism. # Used for verification and testing the state of the organism. sub get_live_cells { sort { $a>[0] <=> $b>[0]  $a>[1] <=> $b>[1] } map { [ _unpack $_ ] } keys %{ shift>{Cells} }; } sub tick { my $self = shift; my $cells = $self>{Cells}; my ( $k1, $k2, $k3, $k4, $k5, $k6, $k7, $k8, $x0, $x1, $x2, $y0, $y1, $y2, %new_cells, %dead_cells ); for my $c (keys %{ $cells }) { # Get the (up to 8) dead cells surrounding the cell ( $x0, $y0 ) = ( $c & 0x1 ? ($c >> 18 ) : $c >> 18, $c & 0x2 ? ($c >> 2 & 0xFFFF) : $c >> 2 & 0xFFFF ); ( $x1, $x2, $y1, $y2 ) = ( $x0  1, $x0 + 1, $y0  1, $y0 + 1 ); $dead_cells{$_}++ for my @zcells = ( ( $k1 = ( abs($x1) << 18 ) + ( $x1 < 0 ? 1 : 0 ) + ( abs($y1) << 2 ) + ( $y1 < 0 ? 2 : 0 ) ) x !(exists $cells>{$k1}), ( $k2 = ( abs($x1) << 18 ) + ( $x1 < 0 ? 1 : 0 ) + ( abs($y0) << 2 ) + ( $y0 < 0 ? 2 : 0 ) ) x !(exists $cells>{$k2}), ( $k3 = ( abs($x1) << 18 ) + ( $x1 < 0 ? 1 : 0 ) + ( abs($y2) << 2 ) + ( $y2 < 0 ? 2 : 0 ) ) x !(exists $cells>{$k3}), ( $k4 = ( abs($x0) << 18 ) + ( $x0 < 0 ? 1 : 0 ) + ( abs($y1) << 2 ) + ( $y1 < 0 ? 2 : 0 ) ) x !(exists $cells>{$k4}), ( $k5 = ( abs($x0) << 18 ) + ( $x0 < 0 ? 1 : 0 ) + ( abs($y2) << 2 ) + ( $y2 < 0 ? 2 : 0 ) ) x !(exists $cells>{$k5}), ( $k6 = ( abs($x2) << 18 ) + ( $x2 < 0 ? 1 : 0 ) + ( abs($y1) << 2 ) + ( $y1 < 0 ? 2 : 0 ) ) x !(exists $cells>{$k6}), ( $k7 = ( abs($x2) << 18 ) + ( $x2 < 0 ? 1 : 0 ) + ( abs($y0) << 2 ) + ( $y0 < 0 ? 2 : 0 ) ) x !(exists $cells>{$k7}), ( $k8 = ( abs($x2) << 18 ) + ( $x2 < 0 ? 1 : 0 ) + ( abs($y2) << 2 ) + ( $y2 < 0 ? 2 : 0 ) ) x !(exists $cells>{$k8}) ); # Check the live cell # Note: next line equivalent to nlive == 2  nlive == 3 @zcells == 5  @zcells == 6 and $new_cells{$c} = undef; } $dead_cells{$_} == 3 and $new_cells{$_} = undef for keys %dead_cell +s; $self>{Cells} = \%new_cells; } sub new { my $class = shift; my %init_self = ( Cells => {} ); bless \%init_self, $class; } 1;
Pack is faster, of course. However, mapping two integers into one is not far behind.
Regards, Mario


Replies are listed 'Best First'.  

Re^12: High Performance Game of Life (updated)
by tybalt89 (Monsignor) on Aug 15, 2017 at 00:13 UTC  
by marioroy (Prior) on Aug 15, 2017 at 01:38 UTC  
by tybalt89 (Monsignor) on Aug 15, 2017 at 01:59 UTC  
by marioroy (Prior) on Aug 15, 2017 at 02:12 UTC  
by marioroy (Prior) on Aug 15, 2017 at 05:30 UTC  
 
by tybalt89 (Monsignor) on Aug 15, 2017 at 02:23 UTC 