
### Re: challenging the dictionary

on Apr 27, 2004 at 16:05 UTC

in reply to challenging the dictionary

Good luck, since this is exactly the NP-complete Minimum Set Cover problem (which makes me hope it's not homework). The universe is the set of letters, and the family of subsets is the dictionary, with each word treated as the set of letters it contains. You want to minimize the number of words needed to use every letter of the alphabet; in other words, the number of subsets needed so that each element of the universe is contained in at least one chosen subset.

You won't be able to do significantly better than brute force, unless an approximation algorithm is good enough for your needs. But with a huge dictionary, the running time is going to be intractable.
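To make the brute-force baseline concrete, here is a minimal Python sketch, not code from the original question. The function name `min_cover_brute_force` and the 26-letter lowercase `ALPHABET` are assumptions made for illustration. It tries every combination of one word, then two, and so on, so the first cover found is a smallest one.

```python
from itertools import combinations

# Assumption: the universe is the 26 lowercase ASCII letters.
ALPHABET = frozenset("abcdefghijklmnopqrstuvwxyz")

def min_cover_brute_force(words, universe=ALPHABET):
    """Return a smallest list of words whose letters cover `universe`, or None."""
    universe = frozenset(universe)
    # Reduce each word to the set of universe letters it contains.
    letter_sets = [(w, frozenset(w.lower()) & universe) for w in words]
    # Try all combinations of size 1, then 2, ... so the first hit is minimal.
    for k in range(1, len(letter_sets) + 1):
        for combo in combinations(letter_sets, k):
            covered = frozenset().union(*(s for _, s in combo))
            if covered == universe:
                return [w for w, _ in combo]
    return None  # the words cannot cover the universe at all
```

The number of combinations of size k grows roughly like n^k for n words, which is why this is only practical on tiny word lists.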

Re^2: challenging the dictionary
by Limbic~Region (Chancellor) on Apr 23, 2006 at 23:11 UTC
This seems to be a relatively fast approximation algorithm (64K words in about 5 seconds):

The algorithm is quite simple. Start with the rarest letter and look for words containing that letter that have the most unique letters not found so far. Wash-Rinse-Repeat.
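For readers without the original code to hand, the following Python sketch is one possible reading of the description above, not the code posted in the thread. The name `greedy_cover`, the tie-breaking rules, and the lowercase 26-letter universe are assumptions. "Rarest" is taken to mean the still-missing letter that appears in the fewest words.

```python
from collections import Counter

def greedy_cover(words, universe="abcdefghijklmnopqrstuvwxyz"):
    """Greedy heuristic: returns a list of covering words, or None if impossible."""
    universe = frozenset(universe)
    # Map each word to the set of universe letters it contains.
    letter_sets = {w: frozenset(w.lower()) & universe for w in words}
    # Count how many words contain each letter.
    freq = Counter(ch for s in letter_sets.values() for ch in s)
    uncovered = set(universe)
    chosen = []
    while uncovered:
        # Rarest letter still missing (ties broken alphabetically).
        rarest = min(uncovered, key=lambda ch: (freq[ch], ch))
        candidates = [w for w, s in letter_sets.items() if rarest in s]
        if not candidates:
            return None  # no word contains this letter
        # Among those, take the word adding the most letters not yet found.
        best = max(candidates, key=lambda w: len(letter_sets[w] & uncovered))
        chosen.append(best)
        uncovered -= letter_sets[best]
    return chosen
```

Being a heuristic, this returns a valid cover when one exists but makes no promise that it is the smallest one.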

Cheers - L~R

Re^2: challenging the dictionary
by Limbic~Region (Chancellor) on Oct 25, 2006 at 14:28 UTC
> Good luck,...

Thanks, I am sure I will need a bit of that.

> But with a huge dictionary, the running time is going to be intractable.

Well, fortunately for humans, alphabets are relatively small, and very long words that do not repeat letters are uncommon. The number of words that you need to consider from a huge dictionary (~65K words) is quite manageable. See How many words does it take? for an example.
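One way to see why the candidate list shrinks is sketched below in Python. This is an illustrative assumption about how such pruning could work, not necessarily what the linked node does, and `prune_words` is a hypothetical name. Words with the same letter set are interchangeable, and a word whose letters are all contained in another word's letters never helps more than that other word. The sketch assumes the words contain only letters.

```python
def prune_words(words):
    """Keep one word per letter set, dropping sets strictly contained in another."""
    # One representative per distinct letter set; anagrams and
    # repeated-letter variants collapse onto the first word seen.
    by_letters = {}
    for w in words:
        by_letters.setdefault(frozenset(w.lower()), w)
    # Visit larger sets first so any superset is examined before its subsets.
    kept = []
    for s in sorted(by_letters, key=len, reverse=True):
        if not any(s < other for other in kept):  # `<` is proper subset
            kept.append(s)
    return [by_letters[s] for s in kept]
```

The subset check here is quadratic in the number of distinct letter sets, so treat it as a sketch of the idea rather than a tuned implementation.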

Cheers - L~R
