comment on

I originally referred to this puzzle as a "word changer", inspired by a puzzle from a daily brain teaser calendar. It was not until I had finished writing the solver and composing this post that I made a final attempt to search PM with several other phrases, including "word ladder". To my chagrin, I found a previous post (Solve Word Ladders) that not only addressed the same puzzle, but also nearly used one of the same examples - ack!! I posted this anyway because my questions still stand (see below), TMTOWTDI, and I hope that someone will find it useful. I retained the use of "word ladder" for consistency between postings.

I wrote a program to solve a word ladder puzzle, and when I finished I wondered how my solution would compare to the approach taken by other Monks. This was a 'just for fun' project, but I found it an interesting challenge. First, a definition of the puzzle:

A word ladder is composed of a known starting word, one or more unknown intermediate words, and a known ending word. The challenge of the puzzle is to change the starting word into the ending word in the specified number of steps (the intermediate words), by changing only one letter at a time. Each intermediate word must be a 'real' word. Of course, what constitutes a 'real' word can be a matter of debate, but I usually limit myself to words that can be found in a standard English dictionary. Proper names, abbreviations, and acronyms are not allowed.

For example, this puzzle changes 'page' into 'book' in 5 steps (4 intermediate words):

  page
  ____
  ____
  ____
  ____
  book
[download]

One possible solution is: page-pane-bane-bank-bonk-book

Another, more appropriate example:
let's turn a 'monk' into 'perl' in 10 steps (9 intermediate words):

Here is one solution:

monk # Start as a lowly monk, and bonk # add a few bonks on the head. bone hone # Hone your skills, cone code # and write better code. core # Live it, breathe it, into your core. cure # There is a cure pure # when your heart is pure... purl perl # it is perl.

You can get perl out of a monk using only 4 intermediate words, but it's not as fun. :)

monk-conk-cork-pork-perk-perl

My code is posted below. My questions are:

How could I improve my code (algorithm, style, efficiency)? I am particularly interested in ideas about how I can list all possible change paths in a way that scales well.
How would you solve this puzzle?
(And just for fun) what puzzles can you create?

Thanks in advance, and happy puzzling!


use strict;
use warnings;

#************************** USER SETTINGS ***************************

# The starting and ending words - all must be the same length.
# Arrays are used so multiple words can be provided (for solving the
# puzzle in pieces rather than all at once).

my @startword = qw( page );
my @endword   = qw( book );

# The number of intermediate words allowed, must be > 1.  For example,
# PAGE -> ___ -> ___ -> ___ -> ___ -> BOOK has 4 intermediate words

my $num_intermed_words = 4;


# The file containing all valid words (a dictionary).  See the
# read_known_words sub for the format of the word file

my $word_file = '2of4brif_sorted.txt';

#****************************** MAIN ********************************

my $wordlength = length( $startword[0] );

if( $wordlength != length( $endword[0] ) )
{
    die "Starting and ending words are not the same length!\n";
}

# read all valid words from the word (dictionary) file

my @all_words;

read_known_words( $word_file, $wordlength, \@all_words );

# create a hash of arrays that lists all possible intermediate words
# the starting word is level 1
# if there are 4 intermediate words the ending word is level 5

my %possible_words;

$possible_words{1} = \@startword;
$possible_words{ $num_intermed_words + 2 } = \@endword;

my $num_steps = int( $num_intermed_words / 2 );

# start at $startword and work to the midpoint

my $num_forsteps = $num_intermed_words % 2 ?
                   $num_steps + 1 : $num_steps;

foreach my $level ( 1 .. $num_forsteps )
{
    print "level $level: ";
    my $ref2backlevel = exists $possible_words{ $level - 1 } ?
                               $possible_words{ $level - 1 } : [ ];
    $possible_words{ $level + 1 } =
        find_within_one_letter( \@all_words,
                                $possible_words{$level},
                                $ref2backlevel );
}

# start at $endword and work to the midpoint

for( my $step = 1, my $level = $num_intermed_words + 2;
     $step <= $num_steps; $step++, $level-- )
{
    print "level $level: ";
    my $ref2backlevel = exists $possible_words{ $level + 1 } ?
                               $possible_words{ $level + 1 } : [ ];
    $possible_words{ $level - 1 } =
        find_within_one_letter( \@all_words,
                                $possible_words{$level},
                                $ref2backlevel );
}

@all_words = (); # clear the dictionary so the memory can be reused

# create subsets based on prior results

my %final_words;

$final_words{1} = \@startword;
$final_words{ $num_intermed_words + 2 } = \@endword;

# compare the middle two sets of possible words

print 'level ', $num_forsteps + 1, ': ';

$final_words{ $num_forsteps + 1 } =
    find_within_one_letter( $possible_words{ $num_forsteps + 1 },
                            $possible_words{ $num_forsteps + 2 },
                            [ ] );

# work backwards from the middle to the start word

for( my $level = $num_forsteps; $level > 1; $level-- )
{
    print "level $level: ";
    $final_words{$level} =
        find_within_one_letter( $possible_words{$level},
                                $final_words{ $level + 1 }, [ ] );
}

# work forwards from the middle to the end word

foreach my $level ( $num_forsteps + 2 .. $num_intermed_words + 1 )

{
    print "level $level: ";
    $final_words{$level} =
        find_within_one_letter( $possible_words{$level},
                                $final_words{ $level - 1 }, [ ] );
}

# Assemble all possible change paths.
# This is fine for most puzzles, but it does not scale well!!  Perhaps
# a blocked approach would be better (list all possible solutions from
# the starting word to the midpoint, then all possible solutions from
# the midpoint to the ending word).

my @paths = @startword;

foreach my $level ( 2 .. $num_intermed_words + 2 )
{
    my @temppaths;

    foreach my $pathoption ( @paths )
    {
        my @pathwords = split( '-', $pathoption );

        foreach my $leveloption ( @{ $final_words{$level} } )
        {
            # if this $leveloption is 1 char diff from the last word
            # in $pathoption, keep it

            my $edit_distance = calc_dist( $pathwords[-1],
                                           $leveloption );

            if( $edit_distance == 1 )
            {
                push( @temppaths,
                      join( '-', $pathoption, $leveloption ) );
            }
        }
    }

    @paths = @temppaths;
}

# Print results

my $outfile = join( '', $startword[0], '2', $endword[0],
                        '_in', $num_intermed_words, '.txt' );

open( my $out_fh, '>', $outfile )
    or die "Error opening output file:\n$!";

print $out_fh 'options for converting \'', $startword[0], '\' to \'',
              $endword[0], '\'', "\n";
print $out_fh "number of intermediate words = $num_intermed_words\n";
print $out_fh "dictionary file = $word_file\n";

foreach my $step ( sort { $a <=> $b } keys %final_words )
{
    print $out_fh "\nstep $step options:\n",
                  join( "\n", @{ $final_words{$step} } );
}

print $out_fh "\n\nall possible conversion paths:\n";

foreach my $pathoption ( @paths )
{
    print $out_fh "$pathoption\n";
}

close $out_fh;

#*************************** SUBFUNCTIONS ***************************

sub read_known_words
{
    my ( $word_file, $word_length, $ref2allwords ) = @_;

    open( my $wordfile, "<$word_file" )
        or die "Error opening $word_file\n$!";

    # find the right section and add the words to the array

    while( defined( my $line = <$wordfile> ) )
    {
        if( $line =~ m/^# LENGTH $word_length\n/ )
        {
            last;
        }
    }

    while( defined( my $line = <$wordfile> ) )
    {
        if( $line =~ m/^# LENGTH/ )
        {
            last;
        }
        else
        {
            chomp $line;
            push( @{ $ref2allwords }, $line );
        }
    }

    close $wordfile;

    if( scalar @{ $ref2allwords } == 0 )
    {
        die "No words of length $word_length found in $word_file\n";
    }
}

sub find_within_one_letter
{
    my ( $ref2dictionary, $ref2startwords, $ref2backwords ) = @_;

    print 'dictionary = ', scalar @{ $ref2dictionary },
          ', starting options = ', scalar @{ $ref2startwords };

    # make the 'backwords' list into a hash for quick lookup

    my %notin;

    foreach my $word ( @{ $ref2backwords } )
    {
        $notin{$word} = 1;
    }

    # scan through the dictionary list and put words that are within
    # 1 letter of a start word (but are not in %notin) into @results

    my @results;

    DICT_WORD:
    foreach my $dictword ( @{ $ref2dictionary } )
    {
        if( exists( $notin{$dictword} ) )
        {
            next DICT_WORD;
        }

        foreach my $startword ( @{ $ref2startwords } )
        {
            my $edit_distance = calc_dist( $dictword, $startword );

            if( $edit_distance == 1 )
            {
                push( @results, $dictword );
                next DICT_WORD;
            }
        }
    }

    print ', found = ', scalar @results, "\n";

    return \@results;
}

sub calc_dist
{
    my ( $word1, $word2 ) = @_;

    my $dist = 0;

    for( my $i = 0; $i <= length( $word1 ) - 1; $i++ )
    {
        unless( substr( $word1, $i, 1 ) eq substr( $word2, $i, 1 ) )
        {
            $dist++;
        }
    }

    return $dist;
}
[download]

In reply to Solving word ladder puzzles by bobf

Are you posting in the right place? Check out Where do I post X? to know for sure.
Posts may use any of the Perl Monks Approved HTML tags. Currently these include the following:
<code> <a> <b> <big> <blockquote> <br /> <dd> <dl> <dt> <em> <font> <h1> <h2> <h3> <h4> <h5> <h6> <hr /> <i> <li> <nbsp> <ol> <p> <small> <strike> <strong> <sub> <sup> <table> <td> <th> <tr> <tt> <u> <ul>
Snippets of code should be wrapped in <code> tags not <pre> tags. In fact, <pre> tags should generally be avoided. If they must be used, extreme care should be taken to ensure that their contents do not have long lines (<70 chars), in order to prevent horizontal scrolling (and possible janitor intervention).
Want more info? How to link or How to display code and escape characters are good places to start.


We don't bite newbies here... much
	PerlMonks