|Problems? Is your data what you think it is?|
Re: Add-A-Gram Performanceby chipmunk (Parson)
|on Feb 05, 2002 at 06:19 UTC||Need Help??|
A word challenge! I love word challenges! :)
First, I'll second grep's suggestion to indent your code properly. The lack of indenting discouraged me from understanding your code, and therefore from providing in-depth comments on your code. :(
The most important optimization is to realize that anagrams only need to be tested once, because it's the set of letters that matters, not their order. For example, starting with 'topsy', if you drop the 'y' the next word could be 'post', or 'stop', or 'tops', etc., but you only need to try one of those words.
I implemented this by putting the words in a hash where the key is the sorted letters in a word. For each set of letters, I keep only the first word found, just to save memory. I also push each set of letters onto an array, grouped by length.
Once that's done, I can start looking for add-a-grams, beginning with the longest words. The first step is to set up a stack of attempted solutions. Then, in a loop, I pop an attempt off the stack and try to branch the last set of letters. Testing each branch is as simple and as fast as looking for the set of letters in the hash. Testing the few possible branches and looking for a matching word is much, much faster than testing all the words and looking for a matching branch. (Okay, so maybe that's the most important optimization. :)
For each match, push an attempt onto the stack. I stuck in an additional optimization here; if none of the branches work, I remove the original set of letters from the stack. That way I won't waste time trying the same dead end later.
Once an attempt gets down to three letters, I print the solution. I continue finding solutions at the same length. However, I don't try the shorter lengths after that, since the object is to find the longest possible solution.
My script ran in a few minutes and found 6640 add-a-grams with a word of 16 letters. I'm sure the code could be cleaned up, but I should be getting to bed.