http://www.perlmonks.org?node_id=1197250

The universe of Conway's Game of Life is an infinite two-dimensional orthogonal grid of square cells, each of which is in one of two possible states, alive or dead. Every cell interacts with its eight neighbours, which are the cells that are horizontally, vertically, or diagonally adjacent. At each step in time, the following transitions occur:

  1. Any live cell with fewer than two live neighbours dies, as if caused by under-population.
  2. Any live cell with two or three live neighbour lives on to the next generation.
  3. Any live cell with more than three live neighbours dies, as if by overcrowding.
  4. Any dead cell with exactly three live neighbours becomes a live cell, as if by reproduction.
The initial pattern constitutes the seed of the system. The first generation is created by applying the above rules simultaneously to every cell in the seed - births and deaths occur simultaneously, and the discrete moment at which this happens is sometimes called a tick (in other words, each generation is a pure function of the preceding one). The rules continue to be applied repeatedly to create further generations.

-- Conway's Game of Life (wikipedia)

I've recently been forced to become acquainted with Conway's Game of Life after being asked to assess various implementations of it submitted by job applicants. Rather than a chore, I actually found it very interesting. Not the GUI stuff, just the guts of the code. How to implement it cleanly, concisely, efficiently.

I wrote a couple of solutions myself, in both Perl and C++11, which I present below. Though still a game of life novice, I've spent quite a bit of time on these solutions already - including some profiling and performance optimization of my C++ solution. My Perl solution is essentially a translation of the one I wrote first in C++.

Having spent time analysing this problem, I'm interested to see how others go about it.

The Challenge

If you want to test yourself, and have a few hours to spare, you might like to have a go (without reading my solutions below) at implementing the above algorithm as a simple Organism class with methods:

For this challenge, you don't need to be truly infinite, but you must handle cells with x and y co-ordinates in the -2 GB to 2 GB range (i.e. 32-bit signed integer x and y co-ordinates). In the interests of keeping the challenge smallish, you don't need to consider graphics, customizations, variations, just implement the four basic rules above.

Performance

I also have a special interest in high performance computing, so hope to learn more about performance and scalability by the feedback I get from this node. I've already sought performance advice related to this challenge in Fastest way to lookup a point in a set.

C++ Solution

Let's start with the Organism class I wrote first, in portable C++11 (tested on both Linux and Windows) - along with a simple performance benchmark program. My Perl solution was derived from this one, so we later compare the performance of the similar C++ and Perl solutions.

// Organism.h // Simple implementation of Conway game of life in standard C++11. #ifndef ORGANISM_H #define ORGANISM_H #include <cstddef> #include <cstdint> #include <vector> #include <unordered_set> #include <algorithm> static_assert(sizeof(size_t) == sizeof(int64_t), "size_t too small, ne +ed a 64-bit compile"); // CELL typedef int32_t cell_coord_type; typedef uint64_t cell_whole_type; typedef int64_t cell_signed_whole_type; #define XX(w) ((cell_coord_type)(w)) #define YY(w) ((cell_coord_type)(((w) >> 32) & 0xFFFFFFFF)) #define WW(x, y) (((cell_signed_whole_type)(y) << 32) | ((cell_signed_ +whole_type)(x) & 0xFFFFFFFF)) struct Cell { cell_coord_type x; cell_coord_type y; Cell(cell_coord_type x_, cell_coord_type y_) : x(x_), y(y_) {} Cell(cell_whole_type w) : x(XX(w)), y(YY(w)) {} }; inline bool operator<(const Cell& lhs, const Cell& rhs) { if (lhs.x < rhs.x) return true; if (lhs.x > rhs.x) return false; return lhs.y < rhs.y; // or one-liner: return lhs.x < rhs.x || (lhs.x == rhs.x && lhs.y < + rhs.y); } inline bool operator>( const Cell& lhs, const Cell& rhs) { return rhs +< lhs; } inline bool operator<=(const Cell& lhs, const Cell& rhs) { return !(lh +s > rhs); } inline bool operator>=(const Cell& lhs, const Cell& rhs) { return !(lh +s < rhs); } inline bool operator==(const Cell& lhs, const Cell& rhs) { return lhs. +x == rhs.x && lhs.y == rhs.y; } inline bool operator!=(const Cell& lhs, const Cell& rhs) { return !(lh +s == rhs); } using cell_list_type = std::vector<Cell>; struct CellHasher { size_t operator()(cell_whole_type w) const { retur +n w; } }; using cell_lookup_type = std::unordered_set<cell_whole_type, CellHashe +r>; // ORGANISM class Organism { public: Organism(size_t ncell = 1000) { cells_m.reserve(ncell); } size_t count() const { return cells_m.size(); } size_t is_alive(cell_whole_type w) const { return cells_m.find(w) ! += cells_m.end(); } size_t is_dead(cell_whole_type w) const { return cells_m.find(w) = += cells_m.end(); } // Used to initialize the starting state of the organism size_t insert_cells(const cell_list_type& cells) { size_t n_inserted = 0; for (const auto& c : cells) { if (cells_m.insert(WW(c.x, c.y)).second) ++n_inserted; } return n_inserted; } // Used for verification and testing the state of the organism cell_list_type get_live_cells() const { cell_list_type vec_items(cells_m.cbegin(), cells_m.cend()); std::sort(vec_items.begin(), vec_items.end()); return vec_items; } // Get the number of live neighbours surrounding a cell size_t get_num_live(cell_coord_type x, cell_coord_type y) const { return is_alive( WW(x-1, y-1) ) + is_alive( WW(x-1, y ) ) + is_alive( WW(x-1, y+1) ) + is_alive( WW(x , y-1) ) + is_alive( WW(x , y+1) ) + is_alive( WW(x+1, y-1) ) + is_alive( WW(x+1, y ) ) + is_alive( WW(x+1, y+1) ); } // Return the number of dead cells surrounding a cell // The cells themselves are returned in z size_t get_dead_cells(cell_coord_type x, cell_coord_type y, cell_wh +ole_type* z) const { size_t n = 0; if (is_dead(WW(x-1, y-1))) z[n++] = WW(x-1, y-1); if (is_dead(WW(x-1, y ))) z[n++] = WW(x-1, y ); if (is_dead(WW(x-1, y+1))) z[n++] = WW(x-1, y+1); if (is_dead(WW(x , y-1))) z[n++] = WW( x, y-1); if (is_dead(WW(x , y+1))) z[n++] = WW( x, y+1); if (is_dead(WW(x+1, y-1))) z[n++] = WW(x+1, y-1); if (is_dead(WW(x+1, y ))) z[n++] = WW(x+1, y ); if (is_dead(WW(x+1, y+1))) z[n++] = WW(x+1, y+1); return n; } void tick() { cell_lookup_type new_cells; size_t ncells = cells_m.size(); new_cells.reserve(ncells + ncells/4); // Stores the (up to 8) dead neighbour cells surrounding a cell cell_whole_type zcells[8]; for (const auto& c : cells_m) { // Get the (up to 8) dead cells surrounding the cell size_t ndead = get_dead_cells(XX(c), YY(c), zcells); // Note: next line equivalent to nlive == 2 || nlive == 3 if (ndead == 5 || ndead == 6) new_cells.insert(c); for (size_t i = 0; i < ndead; ++i) { if (get_num_live(XX(zcells[i]), YY(zcells[i])) == 3) new_c +ells.insert(zcells[i]); } } cells_m = std::move(new_cells); } private: cell_lookup_type cells_m; }; #endif /* ORGANISM_H */

In case you're interested, switching from std::set to std::unordered_set, along with adding the custom CellHasher above, gave a huge performance boost - after profiling indicated that cell lookup was a performance hot spot. While using a single set of live cells is pleasing for its simplicity and unboundedness, its drawback is that counting live neighbours becomes a hash lookup, which is a chronic performance bottleneck.

I'm sure plenty of further performance improvement opportunites are waiting, for example:

  • Write a custom allocator for std::unordered_set (allocate arena on stack if possible)
  • Replace std::unordered_set with a custom sparse set of integers
  • Sort/batch lookups
  • Avoid checking the same dead cell more than once (e.g. create a set union of all dead cells)
  • Find a better organism data structure: speed up checking cell state and counting neighbours (e.g. via a bitmap), keep track of which zones are active to save time by not updating inactive zones (a cell that did not change at the last time step, and none of whose neighbours changed, is guaranteed not to change at the current time step as well), detect repetitive patterns (e.g. blinker), ... Update: this has kinda been done now in More Betterer Game of Life.
  • Memoization, cache optimization, compiler intrinsics and other tricks discussed in this node
  • Intel TBB (Thread Building Blocks)
  • OpenCL (GPGPU)
Further suggestions welcome.

A simple benchmark program using the blinker oscillator:

// tbench1.cpp. Simple benchmark of Organism class. // g++ compile on Linux: // g++ -o tbench1 -std=c++11 -Wall -O3 tbench1.cpp // MSVC 2017 compile on Windows, start a 64-bit compiler cmdline via: // Start > AllApps > Visual Studio 2017 > x64 Native Tools Command P +rompt for VS2017 // then: // cl /W3 /MT /Ox /EHsc tbench1.cpp #include <cstdlib> #include <ctime> #include <string> #include <iostream> #include <fstream> #include <sstream> #include "Organism.h" static bool is_file_exist(const char *fname) { std::ifstream infile(fname); return infile.good(); } static Cell read_cell(const std::string& str) { cell_coord_type x, y; std::istringstream iss(str); iss >> x; iss >> y; return Cell(x, y); } // Reads a Life 1.06 text file static cell_list_type read_cells_106(const char* fname) { cell_list_type cells; std::ifstream cellfile(fname); if (!cellfile) { std::cerr << "Error opening '" << fname << "'\n"; return cells; } for (std::string line; std::getline(cellfile, line); ) { if (line.empty()) continue; // ignore blank lines if (line[0] == '#') continue; // ignore comment lines cells.push_back(read_cell(line)); } return cells; } static void RunTest(const char* fname, int nticks) { Organism org; { cell_list_type cells = read_cells_106(fname); org.insert_cells(cells); size_t ncells = org.count(); std::cout << "cell count at start = " << ncells << "\n"; if (ncells != cells.size()) { std::cerr << "oops cell count mism +atch" << "\n"; } } std::cerr << "run benchmark for " << nticks << " ticks\n"; time_t tstart = ::time(NULL); for (int i = 1; i <= nticks; ++i ) { org.tick(); } time_t tend = ::time(NULL); long taken = static_cast<long>(::difftime(tend, tstart) + 0.5); size_t ncells = org.count(); std::cout << "cell count at end = " << ncells << "\n"; std::cerr << "time taken " << taken << " secs\n"; } int main(int argc, char* argv[]) { if (argc != 3) { std::cerr << "usage: tbench1 file nticks\n"; return 1; } const char* fname = argv[1]; if (!is_file_exist(fname)) { std::cerr << "File '" << fname << "' does not exist\n"; return 1; } int nticks = ::atoi(argv[2]); if (nticks <= 0) { std::cerr << "'" << argv[2] << "'" << " invalid nticks\n"; return 1; } RunTest(fname, nticks); return 0; }

Perl Solution

Finally, here is my derived Perl version, Organism.pm:

package Organism; use strict; use warnings; sub count { my $self = shift; return scalar keys %{ $self->{Cells} }; } sub is_alive { my $self = shift; return 0 + exists $self->{Cells}->{ join ':', @_ }; } # Input a list of [ x, y ] coords sub insert_cells { my $self = shift; for my $r (@_) { $self->{Cells}->{ join ':', @{$r} } = undef } } # Return sorted list of cells in the Organism. # Used for verification and testing the state of the organism. sub get_live_cells { my $self = shift; sort { $a->[0] <=> $b->[0] || $a->[1] <=> $b->[1] } map { [ split /:/, $_ ] } keys %{ $self->{Cells} }; } # Return the list of dead cells surrounding a cell sub get_dead_cells { my ( $self, $x, $y ) = @_; ( (join ':', $x - 1, $y - 1) x !$self->is_alive($x - 1, $y - 1), (join ':', $x - 1, $y ) x !$self->is_alive($x - 1, $y ), (join ':', $x - 1, $y + 1) x !$self->is_alive($x - 1, $y + 1), (join ':', $x , $y - 1) x !$self->is_alive($x , $y - 1), (join ':', $x , $y + 1) x !$self->is_alive($x , $y + 1), (join ':', $x + 1, $y - 1) x !$self->is_alive($x + 1, $y - 1), (join ':', $x + 1, $y ) x !$self->is_alive($x + 1, $y ), (join ':', $x + 1, $y + 1) x !$self->is_alive($x + 1, $y + 1) ); } sub get_num_live { my ( $self, $x, $y ) = @_; $self->is_alive( $x - 1, $y - 1 ) + $self->is_alive( $x - 1, $y ) + $self->is_alive( $x - 1, $y + 1 ) + $self->is_alive( $x , $y - 1 ) + $self->is_alive( $x , $y + 1 ) + $self->is_alive( $x + 1, $y - 1 ) + $self->is_alive( $x + 1, $y ) + $self->is_alive( $x + 1, $y + 1 ); } sub tick { my $self = shift; my %new_cells; for my $c (keys %{ $self->{Cells} }) { # Get the (up to 8) dead cells surrounding the cell my @zcells = $self->get_dead_cells( split /:/, $c ); # Check the live cell # Note: next line equivalent to nlive == 2 || nlive == 3 @zcells == 5 || @zcells == 6 and $new_cells{$c} = undef; # Check the dead cells for my $z (@zcells) { $self->get_num_live( split /:/, $z ) == 3 and $new_cells{$z} += undef; } } $self->{Cells} = \%new_cells; } sub new { my $class = shift; my %init_self = ( Cells => {} ); bless \%init_self, $class; } 1;

And here is the simple blinker benchmark program:

# tbench1.pl - Simple benchmark of Organism class. # Generate blinker test file with, for example: # perl createblinker.pl 500000 -900000 100 >x.tmp 2>y.tmp # then run this script for two ticks, for example: # perl tbench1.pl x.tmp 2 use strict; use warnings; use Organism; # XXX: To read Life 1.06 file, ignore leading line containing "#Life 1 +.06" sub read_cells { my $fname = shift; open( my $fh, '<', $fname ) or die "error: open '$fname': $!"; map { [ split ' ' ] } <$fh>; } @ARGV == 2 or die "usage: $0 file nticks\n"; my $file = shift; my $nticks = shift; $nticks =~ /^\d+$/ or die "error: nticks ($nticks) not a number"; my $org = Organism->new(); { my @cells = read_cells($file); $org->insert_cells(@cells); my $ncells = $org->count(); print "cell count at start = $ncells\n"; $ncells == scalar(@cells) or die "oops"; } warn "run benchmark for $nticks ticks\n"; my $tstart = time; for my $i ( 1 .. $nticks ) { $org->tick(); } my $tend = time; my $taken = $tend - $tstart; my $ncells = $org->count(); print "cell count at end = $ncells\n"; warn "time taken: $taken secs\n";

Along with createblinker.pl to generate the test data:

# Create blinker test data for load testing, e.g.: # perl createblinker.pl 500000 -100000 10 >x.tmp 2>y.tmp # creates 500,000 * 3 blinker cells around y=10 starting at x=-100000 use strict; use warnings; sub blink { my $x = shift; my $y1 = shift; my $y2 = $y1 + 1; my $y3 = $y1 + 2; warn "$x $y1\n$x $y2\n$x $y3\n"; } @ARGV == 3 or die "usage: $0 n xstart ystart\n"; my $n = shift; my $xstart = shift; my $ystart = shift; $n =~ /^\d+$/ or die "error: n not numeric"; $xstart =~ /^[-]?\d+$/ or die "error: xstart not numeric"; $ystart =~ /^[-]?\d+$/ or die "error: ystart not numeric"; my $x = $xstart; my $y = $ystart; my $gap = 3; # $gap must be >= 3 for my $i ( 0 .. $n-1 ) { my $x1 = $x + $i; my $x2 = $x1 + 1; my $x3 = $x1 + 2; print "$x1 $y\n$x2 $y\n$x3 $y\n"; blink( $x2, $y-1 ); $x += $gap; }

And a simple unit test:

# tgol2.t - Simple blinker test of Conway Game of Life Organism class # Generate file blinker.txt with: # perl createblinker.pl 3 100 10 >xtest.tmp 2>ytest.tmp # then run this test with: # prove -v tgol2.t use strict; use warnings; use Organism; use Test::More; my $nblinks = 5; my $ntests = ( $nblinks + 1 ) * 3; plan tests => $ntests; # XXX: To read Life 1.06 file, ignore leading line containing "#Life 1 +.06" sub read_cells { my $fname = shift; open( my $fh, '<', $fname ) or die "error: open '$fname': $!"; map { [ split ' ' ] } <$fh>; } sub test_one { my $org = shift; my $desc = shift; my $expected = shift; my $nexpected = @{$expected}; my $ncells = $org->count(); my @cells = $org->get_live_cells(); cmp_ok( $ncells, '==', $nexpected, "$desc cell count ($ncells)" ); cmp_ok( scalar(@cells), '==', $nexpected, "$desc cell array count" +); is_deeply( \@cells, $expected, "$desc cell array" ); } # Blinker pattern my @blinker1 = read_cells('xtest.tmp'); my @blinker2 = read_cells('ytest.tmp'); my @sblinker1 = sort { $a->[0] <=> $b->[0] || $a->[1] <=> $b->[1] } @b +linker1; my @sblinker2 = sort { $a->[0] <=> $b->[0] || $a->[1] <=> $b->[1] } @b +linker2; # Initialization my $org = Organism->new(); $org->insert_cells(@sblinker1); test_one( $org, "initial", \@sblinker1 ); # Pattern should just blink back and forth for my $i ( 1 .. $nblinks ) { $org->tick(); test_one( $org, "blinker $i", $i % 2 ? \@sblinker2 : \@sblinker1 ); }

Update: Extra Test Program tgol.t

I originally forgot to mention that the following little test tgol.t should be run to catch any errors with negative x and y values before firing off any benchmarks:

# tgol.t - Simple blinker test of Conway Game of Life Organism class use strict; use warnings; use Organism; use Test::More; my $nblinks = 5; my $ntests = ( $nblinks + 1 ) * 3; plan tests => $ntests; sub test_one { my $org = shift; # Organism handle my $desc = shift; # Test description my $expected = shift; # Array ref of (sorted) expected cells my $nexpected = @{$expected}; my $ncells = $org->count(); my @cells = $org->get_live_cells(); cmp_ok( $ncells, '==', $nexpected, "$desc cell count ($ncells)" ); cmp_ok( scalar(@cells), '==', $nexpected, "$desc cell array count" +); is_deeply( \@cells, $expected, "$desc cell array" ); } # Blinker pattern my @blinker1 = ( [ -101, -100 ], [ -100, -100 ], [ -99, -100 ], [ -101, 100 ], [ -100, 100 ], [ -99, 100 ], [ -1, 0 ], [ 0, 0 ], [ 1, 0 ], [ 99, -100 ], [ 100, -100 ], [ 101, -100 ], [ 99, 100 ], [ 100, 100 ], [ 101, 100 ], ); my @blinker2 = ( [ -100, -99 ], [ -100, -100 ], [ -100, -101 ], [ -100, 99 ], [ -100, 100 ], [ -100, 101 ], [ 0, -1 ], [ 0, 0 ], [ 0, 1 ], [ 100, -99 ], [ 100, -100 ], [ 100, -101 ], [ 100, 99 ], [ 100, 100 ], [ 100, 101 ], ); my @sblinker1 = sort { $a->[0] <=> $b->[0] || $a->[1] <=> $b->[1] } @b +linker1; my @sblinker2 = sort { $a->[0] <=> $b->[0] || $a->[1] <=> $b->[1] } @b +linker2; # Initialization my $org = Organism->new(); $org->insert_cells(@blinker1); test_one( $org, "initial", \@sblinker1 ); # Pattern should just blink back and forth for my $i ( 1 .. $nblinks ) { $org->tick(); test_one( $org, "blinker $i", $i % 2 ? \@sblinker2 : \@sblinker1 ); }

Further update: Another little lidka test program tgol3.t should be run to further test your solution.

CPAN Offerings

From a quick look at the CPAN namely:

Game::Life::Infinite::Board seemed to be the best, so, for cheap thrills, I wrote a benchmark of that module too (based on the earlier one for Organism.pm):
# tbench1-infinite.pl - Simple benchmark. # Uses Game::Life::Infinite::Board use strict; use warnings; use Game::Life::Infinite::Board; sub read_lines { my $fname = shift; open( my $fh, '<', $fname ) or die "error: open '$fname': $!"; my @lines = <$fh>; return @lines; } @ARGV == 2 or die "usage: $0 file nticks\n"; my $file = shift; my $nticks = shift; $nticks =~ /^\d+$/ or die "error: nticks ($nticks) not a number"; my $org = Game::Life::Infinite::Board->new(); { my @lines = read_lines($file); $org->loadL106(\@lines); my $ncells = $org->statistics->{'liveCells'}; print "cell count at start = $ncells\n"; $ncells == scalar(@lines) or die "oops"; } warn "run benchmark for $nticks ticks\n"; my $tstart = time; for my $i ( 1 .. $nticks ) { $org->tick(); } my $tend = time; my $taken = $tend - $tstart; my $ncells = $org->statistics->{'liveCells'}; print "cell count at end = $ncells\n"; warn "time taken: $taken secs\n";

Benchmark Results

Benchmark timings for running two ticks on my machine were:
Version375K cells750K cells1.5 million cells3 million cells
C++0 secs0 secs1 secs2 secs
Organism.pm (Mario improvements)13 secs26 secs52 secs108 secs
Organism.pm (Original above)35 secs70 secs141 secs284 secs
Game::Life::Infinite:Board37 secs96 secs273 secs905 secs
Update: the timings above were updated from perl v5.24.0 to perl v5.26.0 (which was very slightly faster).

As for memory use, the maximum Windows Private Bytes used for the three million cell case by each process was:

  • C++ : 517,340K
  • Organism.pm (Original above): 1,455,004K
  • Organism.pm (Mario improvements): 1,596,368K
  • Game::Life::Infinite:Board: 18,138,504K

Update: Benchmark timings running AppleFritter's Lidka test for 30,000 ticks were:
VersionLidka 30,000 ticks
C++36 secs
Organism.pm (Mario improvements)450 secs
Organism.pm (Original above)1635 secs
Game::Life::Infinite:Board640 secs
Note that Game::Life::Infinite::Board performed much better on this test, presumably due to the significantly lower memory requirements - the cell count in the Lidka benchmark remains under 2000 for all 30,000 ticks.

References

Updated: Added more references. Minor improvements to wording. Added Mario performance enhanced Organism.pm to the table of benchmark results. Added extra tgol.t test program in a new section. Another extra lidka test program tgol3.t. Updated tbench1.cpp to read a Life 1.06 text file. Added Lidka Performance 30,000 test to Benchmark Results section; also updated timings from perl v5.24.0 to perl v5.26.0.