kakuro puzzle generator

davidj has asked for the wisdom of the Perl Monks concerning the following question:

My fellow monks,

I got really bored at work a couple nights ago. (12 hour shifts can be excruciatingly long, especially where there isn't much that has to be done.) Anyway, I decided to pass the time seeing if I could hack together a kakuro (cross sums) puzzle generator. Its a crossword-like puzzle that uses numbers instead of words with 2 constraints: 1) each number must be part of a vertical and horizontal sequence, 2) only the numbers 1-9 can be used, without repeating.

Anyway, this is what I came up with. I know its not pretty: I violate the rules regarding global variable, strict and warnings, etc. But what do you expect after a 1 night hack? I'm sure it's less than efficient. And I have no doubt that most of you could write one that will put mine to shame.

I just wanted to put it out there for others to enjoy and to see what kind of solutions you might or might have already come up with.

The generator is passed 2 parameters: grid dimensions and coverage required. Its currenlty coded for NxN grids.

andromeda:davidj sums > cat sums.pl
#!/usr/bin/perl

# puzzles parameters
my $dim = $ARGV[0];
my $coverage = $ARGV[1];
# creation flags
my ($finished,$create_attempts,$total_locs,$num_filled) = (0,0,0,0);
# left column and top row totals
my (@acr_ar, @down_ar) = ((), ());
# hashes used to test for connected graph
my (%list, %visited);
# hashes to track sequence information
my (%across, %down, %intersect);
# arrays used to hold order of longest sequences in each direction
my (@across_order, @down_order);
# main grid
my @g;

# since this uses a brute force method
# there is a chance of failure. Try 5 and bail
until($finished || $create_attempts == 5) {
    my ($complete,$attempts) = (0,0);
    until ($complete) {
        @g = ();
        &init_grid() && &fill_grid();
        next unless &check_coverage() && &check_grid();
        $complete = &test_connected();
    }
    &set_across() && &set_down() && &set_intersect();
    $finished = &set_numbers() until $finished || $attempts++ == 5;
    $create_attempts++;
}
if($finished) {
    @down_ar = map {' ' x 7} (1 .. $dim);
    @acr_ar = map {' ' x 7} (1 .. $dim);
    &set_totals();
    &display_puzzle();
    print "\n\n";
    &display_solution();
} else {
    print "could not create puzzle\n";
}

# initial grid to all '-'
sub init_grid() {
    for(my $i = 0; $i < $dim; $i++) {
        for(my $j = 0; $j < $dim; $j++) {
            $g[$i][$j] = '-';
        }
    }
    return 1;
}

# fill the grid with '@' markers that will be later filled with number
+s
sub fill_grid() {
    my $c = $dim;
    for(my $i = 0; $i < $dim-1; $i++) {
        for(my $j = 0; $j < $c-1; $j++) {
            if (int(rand(4)) % 4 == 0) {
                $g[$i][$j] = $g[$i][$j+1] = $g[$i+1][$j] = $g[$i+1][$j
++1] = '@';
                my $ni = ($dim - 2) - $i;
                my $nj = ($dim - 2) - $j;
                $g[$ni][$nj] = $g[$ni][$nj+1] = $g[$ni+1][$nj] = $g[$n
+i+1][$nj+1] = '@';
            } else {
                $g[$i][$j] = $g[($dim-1)-$i][($dim-1)-$j] = "-" unless
+ $g[$i][$j] eq '@';
            }
        }
        $c--;
    }
    return 1;
}

# make sure the coverage threshold is met (grab total coverage at same
+ time)
sub check_coverage () {
    $total_locs = 0;
    for(my $i = 0; $i < $dim; $i++) {
        for(my $j = 0; $j < $dim; $j++) {
            $total_locs++ if $g[$i][$j] eq "@";
        }
    }
    return ($total_locs / ($dim * $dim)) >= $coverage ? 1 : 0;
}

# make sure there are no sequences greater than length 9
# we need to do this since one of the constraints of the
# puzzle is that the numbers 1-9 cannot be repeated in
# a sequence
sub check_grid() {
    my ($countH, $countV) = (0,0);
    for(my $i = 0; $i < $dim; $i++) {
        for(my $j = 0; $j < $dim; $j++) {
            $g[$i][$j] eq "@" ? $countH++ : ($countH = 0);
            $g[$j][$i] eq "@" ? $countV++ : ($countV = 0);
            return 0 if ($countH > 9 or $countV > 9);
        }
        $countH = $countV = 0;
    }
    return 1;
}

# depth first search to test for connectivity
sub DFS () {
    my $v = shift;
    my @links;
    $visited{$v} = "true";
    @links = sort {$a <=> $b} (@{ $list{$v} });
    foreach my $link ( @links ) {
        &DFS($link) if $visited{$link} eq "false";
    }
}

# self-explanatory
sub display_solution () {
    my $num = ($dim+1) * 8;
    print "\n" . "-" x $num . "\n";
    print '|       ';
    for(my $i = 0; $i <= $dim; $i++) {
        print '|' . $acr_ar[$i];
    }
    print "\n" . "-" x $num . "\n";
    for(my $i = 0; $i < $dim; $i++) {
        print '|' . $down_ar[$i] . '|';
        for(my $j = 0; $j < $dim; $j++) {
            printf("%8s", $g[$i][$j] . '  |');
        }
        print  "\n" . '-' x $num . "\n";
    }
}

# self-explanatory
sub display_puzzle() {
    my $num = ($dim+1) * 8;
    print "\n" . "-" x $num . "\n";
    print '|       ';
    for(my $i = 0; $i <= $dim; $i++) {
        print '|' . $acr_ar[$i];
    }
    print "\n" . "-" x $num . "\n";
    for(my $i = 0; $i < $dim; $i++) {
        print '|' . $down_ar[$i] . '|';
        for(my $j = 0; $j < $dim; $j++) {
            if($g[$i][$j] =~ m/[:-]/) {
                printf("%8s", $g[$i][$j] . '  |');
            } else {
                printf("%8s", ' ' . '  |');
            }
        }
        print  "\n" . '-' x $num . "\n";
    }
}

# get the list of numbers that are not used in the
# row and column that intersect the current location
sub get_unused() {
    my ($aref,$bref,$dref) = @_;
    my (@ar, %seen);
    foreach my $elm ( @{$aref},@{$bref} ) {
        $seen{$elm} = 1;
    }
    foreach my $elm ( @{$dref} ) {
        push @ar, $elm unless $seen{$elm};
    }
    return @ar;
}

# create list of all connecting locations.
# locations are connected if an adjacent n,s,e, or w
# location also has a number in it
sub make_list () {
    %list = ();
    for(my $i = 0; $i < $dim; $i++) {
        for(my $j = 0; $j < $dim; $j++) {
            if($g[$i][$j] eq "@") {
                push(@{ $list{($i*$dim)+$j} }, (($i-1)*$dim)+$j) if ($
+i>0 and ($g[$i-1][$j] eq '@'));
                push(@{ $list{($i*$dim)+$j} }, (($i+1)*$dim)+$j) if $g
+[$i+1][$j] eq "@";
                push(@{ $list{($i*$dim)+$j} }, ($i*$dim)+($j-1)) if ($
+j>0 and ($g[$i][$j-1] eq '@'));
                push(@{ $list{($i*$dim)+$j} }, ($i*$dim) + ($j+1)) if 
+$g[$i][$j+1] eq "@";
            }
        }
    }
    return 1;
}

# get a hash of all row sequence starting locations and all
# the adjacent locations that make up the sequence.
# also get the size of the sequence. We use this later
# in determining which sequences to fill first
sub set_across () {
    %across = ();
    for(my $i = 0; $i < $dim; $i++) {
        for(my $j = 0; $j < $dim; $j++) {
            if($g[$i][$j] eq "@") {
                my @locs = ();
                my $start = ($i * $dim) + $j;
                push(@locs, ($i * $dim) + $j++) while($g[$i][$j] eq "@
+");
                $across{$start}->{locs} = [ @locs ];
            }
        }
    }
    @across_order =  reverse sort { scalar(@{ $across{$a}->{locs} }) <
+=> scalar(@{ $across{$b}->{locs} }) } keys %across;
    return 1;
}

# same as set_across, except for columnar sequences
sub set_down () {
    %down = ();
    for(my $j = 0; $j < $dim; $j++) {
        for(my $i = 0; $i < $dim; $i++) {
            if($g[$i][$j] eq "@") {
                my @locs = ();
                my $start = ($i * $dim) + $j;
                push(@locs, ($i++ * $dim) + $j) while($g[$i][$j] eq "@
+");
                $down{$start}->{locs} = [ @locs ];
            }
        }
    }
    @down_order =  reverse sort { scalar(@{ $down{$a}->{locs} }) <=> s
+calar(@{ $down{$b}->{locs} }) } keys %down;
    return 1;
}

# determine the across and down sequences that intersect
# at each location to be filled. This will be needed to
# guarantee each sequence contains unique numbers
sub set_intersect () {
    my ($aw, $dw);
    %intersect = ();
    for(my $i = 0; $i < $dim; $i++) {
        for(my $j = 0; $j < $dim; $j++) {
            if($g[$i][$j] eq "@") {
                my $loc = ($i * $dim) + $j;
                foreach my $apos (keys %across) {
                    if(grep { $_ == $loc }  @{ $across{$apos}->{locs} 
+}) {
                        $aw = $apos;
                        last;
                    }
                }
                foreach my $dpos (keys %down) {
                    if(grep { $_ == $loc }  @{ $down{$dpos}->{locs} })
+ {
                        $dw = $dpos;
                        last;
                    }
                }
                $intersect{$loc} = [$aw, $dw];
            }
        }
    }
    return 1;
}

# fill the sequences with unique numbers
sub set_numbers () {
    my ($cur, $cur_a, $cur_d, $cur_loc, $dir, $aw, $dw, $nd, $nl);
    my (@grid_locs, @unused);
    my @digits = (1 .. 9);

    #initialize variables
    @grid_locs = @unused = ();
    $num_filled = 0;

    # use copies of data structures since we may need to
    # attempt this more than once
    @tmp_grid = @g;
    %tmp_across = %across;
    %tmp_down = %down;
    @tmp_across_order = @across_order;
    @tmp_down_order = @down_order;

    # we loop until all locations have been filled
    while( $total_locs != $num_filled) {
        # get the sequence with the greatest number of unfilled locati
+ons.
        # we do this because it will be the hardest to fill
        @grid_locs = ();
        while(1) {
            if(scalar(@{ $tmp_across{$tmp_across_order[0]}->{locs} }) 
+>= scalar(@{ $tmp_down{$tmp_down_order[0]}->{locs} }) ) {
                $cur = shift @tmp_across_order;
                ($nl, $nd) = ( scalar(@{ $tmp_across{$cur}->{locs} }),
+ scalar(@{ $tmp_across{$cur}->{nums} }) );
                $dir = "a";
            } else {
                $cur = shift @tmp_down_order;
                ($nl, $nd) = (scalar(@{ $tmp_down{$cur}->{locs} }), sc
+alar(@{ $tmp_down{$cur}->{nums} }) );
                $dir = "d";
            }
            last if $nl > $nd;
        }
        # get array of grid locations of retreived sequence to place n
+umbers in
        @grid_locs = ($dir eq "a" ? @{ $tmp_across{$cur}->{locs} } : @
+{ $tmp_down{$cur}->{locs} });
        #iterate over the locations and place numbers
        while(scalar(@grid_locs) > 0) {
            @unused = ();
            $cur_loc = shift @grid_locs;
            # if the current location is already filled we can skip it
            ($cur_a, $cur_d) = ( (int($cur_loc / $dim)), ($cur_loc % $
+dim) );
            next if $tmp_grid[$cur_a][$cur_d] ne "@";
            # get the row and column sequences that intersect at this 
+location
            # we need this information becuase of the unique number co
+nstraint
            # get the numbers that have NOT been used in either sequen
+ce.
            # fail if all numbers have been used
            ($aw, $dw) = (@{ $intersect{$cur_loc} });
            @unused = &get_unused(\@{$tmp_across{$aw}->{nums}}, \@{$tm
+p_down{$dw}->{nums}}, \@digits);
            return 0 if scalar(@unused) == 0;
            $number = $unused[ int(rand(scalar(@unused))) ];
            # store number in grid location.
            # also, add to the list of numbers used and add to running
+ totals
            $tmp_grid[$cur_a][$cur_d] = $number;
            push( @{ $tmp_across{$aw}->{nums} }, $number);
            $tmp_across{$aw}->{total} += $number;
            push( @{ $tmp_down{$dw}->{nums} }, $number);
            $tmp_down{$dw}->{total} += $number;
            $num_filled++;
        }
    }

    # if we get this far then all is good, so copy back
    @g = @tmp_grid;
    %across = %tmp_across;
    %down = %tmp_down;
    @across_order = @tmp_across_order;
    @down_order = @tmp_down_order;
    return 1;
}

# used for display purposes. enters the sequence totals on the grid.
sub set_totals () {
    foreach my $key (sort keys %down) {
        my ($x,$y) = ( int($key/$dim) - 1, $key % $dim );
        if($key < $dim) {
            $acr_ar[$key] = ' ' x (4 - length($down{$key}->{total})) .
+ $down{$key}->{total} . ':  ';
        }
        if($x >= 0) {
            $g[$x][$y] = $down{$key}->{total} . ':';
        }
    }
    foreach my $key (sort keys %across) {
        my ($x,$y) = ( int($key/$dim), ($key % $dim) -1 );
        if ($y >= 0) {
            $g[$x][$y] =~ s/-//g;
            $g[$x][$y] .= ":" . $across{$key}->{total};
            $g[$x][$y] =~ s/::/:/;
        } else {
            $down_ar[$x] = '  :' . $across{$key}->{total} . ' ' x (4 -
+ length($across{$key}->{total}));
        }
    }
}

# one constraint of the puzzle is that it must be 'connected'
# connected means that each square containing a number is
# part of an across sequence and down sequence
# uses dfs to test for connectivity
sub test_connected () {
    my $searches = 0;
    my @vertices = ();
    &make_list();
    @vertices = sort {$a <=> $b} (keys %list);
    %visited = map { $_ => "false" } @vertices;
    foreach my $vertex (@vertices) {
        if($visited{$vertex} eq "false") {
            $searches++;
            &DFS($vertex);
        }
    }
    return $searches > 1 ? 0 : 1;
}
[download]

sample output on an 7x7 grid with 60% coverage required. The clues are given in num:num format. The number to the left of the ':' is the sum of the sequence below. The number to the right of the ':' is the sum of the sequence to the right. Sequences consist of empty cells. Cells with '-' in them are not used.

andromeda:davidj sums > perl sums.pl 7 .6

----------------------------------------------------------------
|       |  10:  |  23:  |       |  12:  |  31:  |       |       |
----------------------------------------------------------------
|  :12  |       |       | 37:9  |       |       |  36:  |   9:  |
----------------------------------------------------------------
|  :30  |       |       |       |       |       |       |       |
----------------------------------------------------------------
|       |  :36  |       |       |       |       |       |       |
----------------------------------------------------------------
|       | 8:14  |       |       |19:14  |       |       |    -  |
----------------------------------------------------------------
|  :25  |       |       |       |       |       |       |  11:  |
----------------------------------------------------------------
|  :39  |       |       |       |       |       |       |       |
----------------------------------------------------------------
|       |    -  |  :10  |       |       |   :7  |       |       |
----------------------------------------------------------------

SOLUTION

----------------------------------------------------------------
|       |  10:  |  23:  |       |  12:  |  31:  |       |       |
----------------------------------------------------------------
|  :12  |    9  |    3  | 37:9  |    3  |    6  |  36:  |   9:  |
----------------------------------------------------------------
|  :30  |    1  |    4  |    7  |    2  |    3  |    8  |    5  |
----------------------------------------------------------------
|       |  :36  |    2  |    8  |    7  |    9  |    6  |    4  |
----------------------------------------------------------------
|       | 8:14  |    5  |    9  |19:14  |    5  |    9  |    -  |
----------------------------------------------------------------
|  :25  |    5  |    1  |    6  |    2  |    7  |    4  |  11:  |
----------------------------------------------------------------
|  :39  |    3  |    8  |    5  |    9  |    1  |    7  |    6  |
----------------------------------------------------------------
|       |    -  |  :10  |    2  |    8  |   :7  |    2  |    5  |
----------------------------------------------------------------
[download]

Enjoy,

davidj

Back to Seekers of Perl Wisdom