Beefy Boxes and Bandwidth Generously Provided by pair Networks
Don't ask to ask, just ask
 
PerlMonks  

Comment on

( #3333=superdoc: print w/ replies, xml ) Need Help??

It's interesting to follow the development of a hard problem like this, to see what works and what doesn't, and more importantly to understand why.

This particular case appears to be a less strict version of the Scrabble problem (adding your letters to the existing board, instead of just forming words). I've looked at that space a bit myself.

A Word Twister solution might be
(1) create a sorted anagram lookup for the dictionary,
(2) join the anagram keys into a string with an appropriate separator,
(3) determine the letter counts in the given letters,
(4) create regex snippet strings for each letter count (e.g., "a{0,3}"),
(5) permute the regex snippets into larger regex strings,
(6) match the anagram key string against the larger regex strings,
(7) print the resulting words

Maybe something like this:

eval '(exit $?0)' && eval 'exec perl -w -S $0 ${1+"$@"}' && eval 'exec perl -w -S $0 $argv:q' if 0; # The above invocation finds perl in the path, wherever it may be # find words in dictionary containing only letters given # usage: $0 given_letters word_list_file_name[s] use strict; use warnings; our $min_word_length = 3; our $given = shift; our $given_length = length( $given ); our @factorial; $factorial[0] = 1; # read in dictionary, removing invalid words, and words with letters n +ot in the given list warn "Reading in dictionary\n"; our @word_list; while (<>) { chomp; next unless /^[a-z]{$min_word_length,$given_length}$/; my $temp; ($temp = $_) =~ s/[$given]//g; next if length($temp); push @word_list, $_; } warn "\t", scalar @word_list, " words in dictionary (after filtering)\ +n"; warn "Creating dictionary anagrams\n"; our %word_list_sorted_anagrams; for my $word (@word_list) { my $key = join('', sort split '', $word); # some anagrams will not be unique push @{$word_list_sorted_anagrams{$key}}, $word; } # match this against a regex later my $word_list_sorted_anagrams_key_string = join ',', keys %word_list_s +orted_anagrams; warn "\t", scalar keys %word_list_sorted_anagrams, " anagram keys\n"; warn "Creating given regexes\n"; # convert given to regex strings of the form "a{0,3}b{0,2}..." my @given = split '', $given; my %given_counts; for my $g (@given) { $given_counts{$g}++; } my @given_regexes; for my $g (keys %given_counts) { push @given_regexes, sprintf "%s{0,%d}", $g,$given_counts{$g},$giv +en_counts{$g}; } # permute the regex strings warn "Permuting given regexes\n"; my %regex_permutations; my $n = 0; for my $n (0..factorial(scalar @given_regexes)-1) { $regex_permutations{permutation_n($n,@given_regexes)} = 1; $n++; } warn "\t", scalar keys %regex_permutations, " regex permutations\n"; # find all matching anagram keys warn "Matching anagram keys\n"; # number of permutations can be large, use a loop instead of a monster + regex my @key_match_results; my $counter; # for progress indicator, this loop can take a while! for my $regex_permutation (keys %regex_permutations) { my @results = grep {defined($_) and length($_)} $word_list_sorted_ +anagrams_key_string =~ m/\b$regex_permutation\b/g; push @key_match_results, grep {defined($_) and length($_)} @result +s; $counter++; # print STDERR "." unless $counter % 100; # progress indicator for +the impatient } warn "\n\t", scalar @key_match_results, " matching anagram keys\n"; # lookup words from keys warn "Looking up matching words\n"; my @words_matched; for my $k (@key_match_results) { push @words_matched, @{$word_list_sorted_anagrams{$k}}; } print "@words_matched\n"; warn "\t", scalar @words_matched, " words matched\n"; exit; ###########################################3 # Find and return the $n'th permutation # of the remaining arguments in some canonical order # (modified from QOTW solution) sub permutation_n { my $n = shift; my $result = ''; while (@_) { ($n, my $r) = (int($n/@_), $n % @_); $result .= splice @_, $r, 1; } return $result; } ########################### # we might do this a lot, so cache the results sub factorial { my $n = shift; # if we already know it, return it return $factorial[$n] if defined $factorial[$n]; # else compute it from the largest known result my $result = $factorial[$#factorial]; for my $k ( $#factorial+1..$n ) { $result *= $k; } return $result; }

Using the http://perl.plover.com/qotw/words/Web2.gz, and posterboy as the given:

> time word_twister.pl posterboy web2* >! posterboy.txt Reading in dictionary 729 words in dictionary (after filtering) Creating dictionary anagrams 512 anagram keys Creating given regexes Permuting given regexes 40320 regex permutations Matching anagram keys 460888 matching anagram keys Looking up matching words 763232 words matched 137.060u 0.290s 2:20.51 97.7% 0+0k 0+0io 2871pf+0w

Now this is pretty slow at filtering the dictionary against the given, and even slower for longer given strings. Improvements welcome.

For your particular case, you might want it to loop, asking for a new given, and not reading in the dictionary every time. (But that doesn't save much work, as the dictionary filtering against the given is expensive the way I've done it here.)

-QM
--
Quantum Mechanics: The dreams stuff is made of


In reply to Re: Turning A Problem Upside Down by QM
in thread Turning A Problem Upside Down by Limbic~Region

Title:
Use:  <p> text here (a paragraph) </p>
and:  <code> code here </code>
to format your post; it's "PerlMonks-approved HTML":



  • Posts are HTML formatted. Put <p> </p> tags around your paragraphs. Put <code> </code> tags around your code and data!
  • Read Where should I post X? if you're not absolutely sure you're posting in the right place.
  • Please read these before you post! —
  • Posts may use any of the Perl Monks Approved HTML tags:
    a, abbr, b, big, blockquote, br, caption, center, col, colgroup, dd, del, div, dl, dt, em, font, h1, h2, h3, h4, h5, h6, hr, i, ins, li, ol, p, pre, readmore, small, span, spoiler, strike, strong, sub, sup, table, tbody, td, tfoot, th, thead, tr, tt, u, ul, wbr
  • Outside of code tags, you may need to use entities for some characters:
            For:     Use:
    & &amp;
    < &lt;
    > &gt;
    [ &#91;
    ] &#93;
  • Link using PerlMonks shortcuts! What shortcuts can I use for linking?
  • See Writeup Formatting Tips and other pages linked from there for more info.
  • Log In?
    Username:
    Password:

    What's my password?
    Create A New User
    Chatterbox?
    and the web crawler heard nothing...

    How do I use this? | Other CB clients
    Other Users?
    Others making s'mores by the fire in the courtyard of the Monastery: (18)
    As of 2014-07-25 16:45 GMT
    Sections?
    Information?
    Find Nodes?
    Leftovers?
      Voting Booth?

      My favorite superfluous repetitious redundant duplicative phrase is:









      Results (174 votes), past polls