PerlMonks  

Re^3: Hangman Assistant

by Limbic~Region (Chancellor)
on Jul 14, 2009 at 13:30 UTC (#779917)


in reply to Re^2: Hangman Assistant
in thread Hangman Assistant

Lawliet,
Let me back into the pruning opportunity by explaining where I was going with the other (now abandoned) method. In my discussion with blokhead, I explained that it should be possible both to guarantee a better than 50/50 chance of guessing correctly and to prune the remaining candidates by more than 50%. I was setting out to do just that.

Now let's assume we were going with 'eclectic'. There were 9,638 words in my dictionary with a length of 8. The letter that appeared in the most of those words was 'e', at 6,666 words. Note that I didn't count total occurrences of 'e', only 1 per word. This equates to 69%. What I set out to do was map the different ways the letter 'e' appears across those 6,666 words. In the word 'eclectic' it appears in positions 1 and 4, whereas in 'envelope' it appears in positions 1, 4 and 8. After guessing and seeing which positions were filled in, I could even eliminate words with letter 'e' even if they shared the common position because they didn't share all positions. That last part (all positions) was the piece I hadn't considered in my previous exercise.

So here is the mind-blowing part. The most common set of positions across the 6,666 words with the letter 'e' still had fewer than 1,600 possible words. This means that by selecting the letter 'e' (69% chance of being right) I will, on a hit, reduce the candidate list from 9,638 to fewer than 1,600 (and probably a lot further); on a miss, only the 2,972 words without an 'e' remain. It therefore seemed pointless to come up with weights for choosing a letter based on both the probability of being correct and the degree to which the candidate list is reduced, because the "dumb" method was still doing a superb job.
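The "all positions" pruning described above can be sketched roughly as follows. This is an illustration in Python, not the code from this thread: the function names (`positions`, `most_common_letter`, `prune`) and the five-word list are hypothetical, and the real dictionary is assumed to be much larger.

```python
from collections import Counter

def positions(word, letter):
    """Return the 1-based positions at which `letter` occurs in `word`."""
    return tuple(i for i, ch in enumerate(word, start=1) if ch == letter)

def most_common_letter(candidates, guessed=()):
    """Pick the unguessed letter that appears in the most candidate words.

    Each word counts a letter at most once, however often it repeats.
    Ties fall to the alphabetically first letter (an arbitrary choice).
    """
    counts = Counter()
    for word in candidates:
        counts.update(set(word) - set(guessed))
    return max(sorted(counts), key=lambda c: counts[c])

def prune(candidates, letter, revealed):
    """Keep only words whose positions for `letter` match `revealed` exactly.

    A word that shares some revealed positions but has the letter anywhere
    else as well is dropped; a miss is represented by an empty `revealed`.
    """
    revealed = tuple(revealed)
    return [w for w in candidates if positions(w, letter) == revealed]

# Hypothetical miniature dictionary of 8-letter words.
words = ["eclectic", "envelope", "elephant", "doctrine", "category"]

guess = most_common_letter(words)            # 'e' occurs in all five words
survivors = prune(words, guess, (1, 4))      # positions revealed for 'eclectic'
print(guess, survivors)
```

Here 'envelope' is eliminated even though it has an 'e' in positions 1 and 4, because it also has one in position 8.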

I do have one final revision to make. I choose the letter that appears in the most candidate words, but I don't break ties. Currently a tie is resolved by whatever order sort returns. I plan to add total count as a secondary tie-breaking condition to see if that improves results. I should post something later today.
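One way the planned tie-break might look, again as a hedged Python sketch rather than the actual revision: the primary key is the number of candidate words containing the letter, and the secondary key is the letter's total occurrence count. The name `pick_letter` and the two-word list are made up for illustration.

```python
from collections import Counter

def pick_letter(candidates, guessed=()):
    """Choose the next guess from the letters not yet guessed.

    Primary key: number of candidate words containing the letter.
    Secondary key: total occurrences of the letter across all candidates.
    Any remaining tie falls to the alphabetically first letter.
    """
    guessed = set(guessed)
    in_words = Counter()  # words containing the letter (once per word)
    total = Counter()     # every occurrence of the letter
    for word in candidates:
        remaining = [ch for ch in word if ch not in guessed]
        total.update(remaining)
        in_words.update(set(remaining))
    return max(sorted(in_words), key=lambda c: (in_words[c], total[c]))

# 'b' and 'n' both appear in two words, but 'n' occurs four times to 'b's three.
print(pick_letter(["banana", "bonbon"]))
```

Whether the secondary key actually improves results is exactly what remains to be measured.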

Cheers - L~R


Re^4: Hangman Assistant
by Lawliet (Curate) on Jul 14, 2009 at 15:44 UTC
    I could even eliminate words with letter 'e' even if they shared the common position because they didn't share all positions.

    Ah, that is what I have left to implement. The method I initially used is the exact same as the one you use besides the 'all positions' part, which is why yours works better.

    I don't mind occasionally having to reinvent a wheel; I don't even mind using someone's reinvented wheel occasionally. But it helps a lot if it is symmetric, contains no fewer than ten sides, and has the axle centered. I do tire of trapezoidal wheels with offset axles. --Joseph Newcomer
