in reply to Turning A Problem Upside Down

It's interesting to follow the development of a hard problem like this, to see what works and what doesn't, and more importantly to understand why.

This particular case appears to be a less strict version of the Scrabble problem (adding your letters to the existing board, instead of just forming words). I've looked at that space a bit myself.

A Word Twister solution might be
(1) create a sorted anagram lookup for the dictionary,
(2) join the anagram keys into a string with an appropriate separator,
(3) determine the letter counts in the given letters,
(4) create regex snippet strings for each letter count (e.g., "a{0,3}"),
(5) permute the regex snippets into larger regex strings,
(6) match the anagram key string against the larger regex strings,
(7) print the resulting words

Maybe something like this:

```eval '(exit \$?0)' && eval 'exec perl -w -S \$0 \${1+"\$@"}'
&& eval 'exec perl -w -S \$0 \$argv:q'
if 0;

# The above invocation finds perl in the path, wherever it may be

# find words in dictionary containing only letters given
# usage: \$0 given_letters word_list_file_name[s]

use strict;
use warnings;

our \$min_word_length = 3;
our \$given = shift;
our \$given_length = length( \$given );

our @factorial;
\$factorial[0] = 1;

# read in dictionary, removing invalid words, and words with letters n
+ot in the given list
our @word_list;
while (<>)
{
chomp;
next unless /^[a-z]{\$min_word_length,\$given_length}\$/;
my \$temp;
(\$temp = \$_) =~ s/[\$given]//g;
next if length(\$temp);
push @word_list, \$_;
}
warn "\t", scalar @word_list, " words in dictionary (after filtering)\
+n";

warn "Creating dictionary anagrams\n";
our %word_list_sorted_anagrams;
for my \$word (@word_list)
{
my \$key = join('', sort split '', \$word);
# some anagrams will not be unique
push @{\$word_list_sorted_anagrams{\$key}}, \$word;
}
# match this against a regex later
my \$word_list_sorted_anagrams_key_string = join ',', keys %word_list_s
+orted_anagrams;
warn "\t", scalar keys %word_list_sorted_anagrams, " anagram keys\n";

warn "Creating given regexes\n";
# convert given to regex strings of the form "a{0,3}b{0,2}..."
my @given = split '', \$given;
my %given_counts;
for my \$g (@given)
{
\$given_counts{\$g}++;
}
my @given_regexes;
for my \$g (keys %given_counts)
{
push @given_regexes, sprintf "%s{0,%d}", \$g,\$given_counts{\$g},\$giv
+en_counts{\$g};
}

# permute the regex strings
warn "Permuting given regexes\n";
my %regex_permutations;
my \$n = 0;
for my \$n (0..factorial(scalar @given_regexes)-1)
{
\$regex_permutations{permutation_n(\$n,@given_regexes)} = 1;
\$n++;
}
warn "\t", scalar keys %regex_permutations, " regex permutations\n";

# find all matching anagram keys
warn "Matching anagram keys\n";
# number of permutations can be large, use a loop instead of a monster
+ regex
my @key_match_results;
my \$counter; # for progress indicator, this loop can take a while!
for my \$regex_permutation (keys %regex_permutations)
{
my @results = grep {defined(\$_) and length(\$_)} \$word_list_sorted_
+anagrams_key_string =~ m/\b\$regex_permutation\b/g;
push @key_match_results, grep {defined(\$_) and length(\$_)} @result
+s;
\$counter++;
#    print STDERR "." unless \$counter % 100; # progress indicator for
+the impatient
}
warn "\n\t", scalar @key_match_results, " matching anagram keys\n";

# lookup words from keys
warn "Looking up matching words\n";
my @words_matched;
for my \$k (@key_match_results)
{
push @words_matched, @{\$word_list_sorted_anagrams{\$k}};
}

print "@words_matched\n";
warn "\t", scalar @words_matched, " words matched\n";

exit;

###########################################3
# Find and return the \$n'th permutation
# of the remaining arguments in some canonical order
# (modified from QOTW solution)
sub permutation_n
{
my \$n = shift;
my \$result = '';
while (@_)
{
(\$n, my \$r) = (int(\$n/@_), \$n % @_);
\$result .= splice @_, \$r, 1;
}
return \$result;
}

###########################
# we might do this a lot, so cache the results
sub factorial
{
my \$n = shift;

# if we already know it, return it
return \$factorial[\$n] if defined \$factorial[\$n];

# else compute it from the largest known result
my \$result = \$factorial[\$#factorial];
for my \$k ( \$#factorial+1..\$n )
{
\$result *= \$k;
}
return \$result;
}

Using the http://perl.plover.com/qotw/words/Web2.gz, and posterboy as the given:

```> time word_twister.pl posterboy web2* >! posterboy.txt
729 words in dictionary (after filtering)
Creating dictionary anagrams
512 anagram keys
Creating given regexes
Permuting given regexes
40320 regex permutations
Matching anagram keys
460888 matching anagram keys
Looking up matching words
763232 words matched
137.060u 0.290s 2:20.51 97.7%   0+0k 0+0io 2871pf+0w

Now this is pretty slow at filtering the dictionary against the given, and even slower for longer given strings. Improvements welcome.

For your particular case, you might want it to loop, asking for a new given, and not reading in the dictionary every time. (But that doesn't save much work, as the dictionary filtering against the given is expensive the way I've done it here.)

-QM
--
Quantum Mechanics: The dreams stuff is made of

Comment on Re: Turning A Problem Upside Down
Re^2: Turning A Problem Upside Down
by QM (Priest) on Aug 24, 2009 at 21:34 UTC

I realized I didn't need the permutation either, so the outline becomes

A Word Twister solution might be
(1) create a sorted anagram lookup for the dictionary,
(2) join the anagram keys into a string with an appropriate separator,
(3) determine the letter counts in the given letters,
(4) create regex snippet strings for each letter count (e.g., "a{0,3}"),
(5) match the given regex string against the monster dictionary anagram string,
(6) lookup the words from the matched anagrams,
(6) print the resulting words

I've also been toying with this idea more, and realized the previous result was bogus. So I've fixed that and cleaned it up:

-QM
--
Quantum Mechanics: The dreams stuff is made of