Beefy Boxes and Bandwidth Generously Provided by pair Networks
Pathologically Eclectic Rubbish Lister
 
PerlMonks  

Word Unscrambler

by jonnybe (Scribe)
on Dec 17, 2005 at 01:50 UTC ( #517419=snippet: print w/ replies, xml ) Need Help??

Description: While my GF was busy playing an online word scramble game, I decided to automate the process. I managed to get the initial script working in a few minutes then started to play with optimizing it. I managed to get a speed improvement of a factor of 6. Anyone have any further suggestions? For instance, is the regexp the right way to go? The dictionary used was a master dump from aspell for anyone who actually tries it out; one word per line. The letters to be unscrambled should be provided as space seperated command line arguments. (Stating the obvious I know!)
#! /usr/bin/perl -w

use strict;

my ( $word, $tmpword );

open DICT, 'wordlist' or die "Cannot open dictionary: $!";

while ( $word = <DICT> ) {
    next if ( length $word  > $#ARGV + 2 );
    chop $word;
    $tmpword = $word;
    foreach my $i ( @ARGV ) {
        $tmpword =~ s/$i//;
    }
    unless ( $tmpword ) {
        print "$word\n";
    }
}

Comment on Word Unscrambler
Download Code
Re: Word Unscrambler
by Util (Priest) on Dec 17, 2005 at 03:44 UTC

    If speed is your concern, I would recommend using a pre-built regexp for fast filtering of the candidate words, and a slower exact test against a canonical (sorted letters) form.

    Tested, working code:

    use strict; use warnings; my $dict = '/usr/share/dict/words'; # Do not space-separate the word in this version. my $scrambled_word = shift or die "Must specify a word\n"; my $scrambled_length = length $scrambled_word; my $scrambled_sorted = join '', sort split '', $scrambled_word; my $pattern = qr{ \A (?: [$scrambled_word]{$scrambled_length} ) \n \z }x; open DICT, '<', $dict or die "Cannot open '$dict': $!"; while (<DICT>) { next unless /$pattern/o; chomp; my $sorted = join '', sort split '', $_; next unless $sorted eq $scrambled_sorted; print "$_\n"; } close DICT or warn;

Re: Word Unscrambler
by hossman (Prior) on Dec 17, 2005 at 08:49 UTC

    I did something similar a few years back. I found that the biggest speed boost came from partitioning the dictionary in advance - making a sepearte file for each word length (ie: 1.letter.words, 2.letter.words, 3.letter.words, etc...) so you don't have to loop over words that obviously don't match.

    It's obviously not worth it if you're just going to run it once, but if you this is somehting you plan on doing again and again...

    Another approach I've had good luck with on a different type of word problem was to divide the dictionary up based on the first character in each word. That wouldn't help in your case, but a varient would...

    Assuming your dictionary looks like this...

    at
    bat
    cat
    car
    tab
    

    ...reformat it so it looks like this...

    abt    bat
    abt    tab
    acr    car
    act    cat
    at     at
    

    ...then when you get a scrabbled word, sort the letters, and look it up (straight substring match, no regex needed) to find all the matches. To decrease the search space, you can partition the dictionary by first letter of the "word" (first letter when the letters are sorted that is) or by the length of the words, or by both.

      uscramble, lateptrincne
Re: Word Unscrambler
by TedPride (Priest) on Dec 17, 2005 at 15:07 UTC
    Heh, I wrote something like this for the Neopets word scrambler game, just for the heck of it. I'm on a quest to see how many Neopets games I can achieve Grand Master on using Perl scripts. The interesting thing is that you have to produce all words that can be made from the letters, so it's not as simple as just sorting the letters in order and finding a match. Instead, I make sure that the letter counts for the dictionary word are equal to or smaller than the letter counts for the letter set (plus blanks), then sort the result set by size and alphabetically. I get my results quite fast (a fraction of a second usually), and they'd probably be faster if I went to the bother of eliminating the words larger than 6 letters from the dictionary, since the Neopets game always gives you 6. Also, my algorithm is set up to support blanks, which isn't necessary either unless you want to use this for Scrabble.
    use strict; use warnings; my $letters = 'jboiac'; my ($lcount, $blanks, %lhash, $handle, @matches); $lcount = length($letters); $blanks = $letters =~ s/([^a-z])//g; $lhash{$_}++ for split //, $letters; open ($handle, 'dict1.dat'); while (<$handle>) { chomp; push @matches, $_ if scrabble($_); } close ($handle); print join "\n", sort { length($b) <=> length($a) || $a cmp $b } @matc +hes; sub scrabble { return 0 if length($_[0]) > $lcount; my ($nf, %wlhash); $wlhash{$_}++ for split //, $_[0]; for (keys %wlhash) { no warnings; return 0 if $lhash{$_} < $wlhash{$_} && ($nf += $wlhash{$_} - $lhash{$_}) > $blanks; } return 1; }
    Returns:
    jacobi ciao abc bio boa cab cob jab jib job
    (abc wouldn't be allowed in Scrabble, but the Neopets game has some odd words in it, including names of Neopets. I had to add these manually as I went along using a second script)
Re: Word Unscrambler
by Posthumous (Scribe) on May 03, 2006 at 15:58 UTC

    This is a late comment, but I hope not out of place.

    My daughter and I were working on a Jumble puzzle with a word we just couldn't crack. So, of course, I decided to throw the Perl wrench at it. As I thought about the algorithm, regular expressions just seemed like a natural choice.

    I have to say this solution amazed me (I admit it, I'm not an experienced or professionally educatued programmer). I mean look at: 12 lines! And the solution -- how clever! Rather than looking for a match, it eliminates non-matches. I've learned something here.

    OK, so I needed to solve a jumble, not find all possibilities. Two quick changes to one line, and I'm off:

    next unless ( length $word == $#ARGV + 2 );

    Great post, great solution. Thanks jonnybe

Back to Snippets Section

Log In?
Username:
Password:

What's my password?
Create A New User
Node Status?
node history
Node Type: snippet [id://517419]
help
Chatterbox?
and the web crawler heard nothing...

How do I use this? | Other CB clients
Other Users?
Others musing on the Monastery: (3)
As of 2014-08-30 23:57 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    The best computer themed movie is:











    Results (294 votes), past polls