in reply to Re^2: Hangman Assistant
in thread Hangman Assistant
Let me back into the pruning opportunity by explaining where I was going with the other (now abandoned) method. In my discussion with blokhead, I explained that it should be possible both to guarantee a better than 50/50 chance of guessing correctly and to prune the remaining candidates by more than 50%. I was setting out to do just that.
Now let's assume the hidden word is 'eclectic'. There were 9,638 words in my dictionary with a length of 8. The letter that appeared in the most of those words was 'e', at 6,666. Note that I didn't count total occurrences of 'e', only one per word. That works out to 69%. What I set out to do was map the different ways the letter 'e' appeared across those 6,666 words. In 'eclectic' it appears in positions 1 and 4, whereas in 'envelope' it appears in positions 1, 4 and 8. After guessing and seeing which positions were filled in, I could eliminate words containing 'e' even when they shared some of the revealed positions, because they didn't share all of them. That last part (matching all positions) was the piece I hadn't considered in my previous exercise.

So here is the mind-blowing part. The most common set of positions across the 6,666 words containing 'e' still covered fewer than 1,600 words. This means that by guessing 'e' (a 69% chance of being right) I reduce the candidate list from 9,638 to fewer than 1,600 (and probably a lot further). It seemed pointless, then, to come up with weights trading off a letter's probability of being correct against the degree to which it reduces the candidate list, because the "dumb" method was already doing a superb job.
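To make the position-set pruning concrete, here is a minimal Python sketch (not the original implementation). The five-word list is a stand-in for the real 9,638-word dictionary, and `positions`/`prune` are hypothetical helper names:

```python
from collections import Counter

def positions(word, letter):
    """Return the tuple of indices where letter occurs in word."""
    return tuple(i for i, c in enumerate(word) if c == letter)

def prune(candidates, letter, revealed):
    """Keep only words whose positions for `letter` match `revealed` exactly.

    `revealed` is the tuple of positions filled in after the guess; words
    that share only some of those positions are eliminated too, which is
    the "all positions" insight described above.
    """
    return [w for w in candidates if positions(w, letter) == revealed]

# Tiny stand-in candidate list (the real run used 9,638 eight-letter words).
words = ["eclectic", "envelope", "absolute", "notebook", "mediocre"]

# Letter appearing in the most words, counted once per word.
best = Counter(c for w in words for c in set(w)).most_common(1)[0][0]

# Suppose the hidden word is 'eclectic': guessing 'e' reveals
# positions 1 and 4 (indices 0 and 3 here, since Python is 0-based).
remaining = prune(words, best, positions("eclectic", best))
```

Note that 'envelope' is pruned even though it has 'e' in positions 1 and 4, because it also has one in position 8, so its position set doesn't match exactly.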
I have one final revision to make. I choose the letter that appears in the most candidate words, but I don't break ties; currently the winner is just whichever letter sort happens to put first. I plan to add total occurrence count as a secondary, tie-breaking condition to see if that improves results. I should post something later today.
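The planned tie-break could be sketched like this in Python (a guess at the intent, with a hypothetical `pick_letter` helper): total occurrences only matter between letters that appear in the same number of words.

```python
from collections import Counter

def pick_letter(candidates):
    """Pick the letter in the most words; break ties by total occurrences."""
    per_word = Counter()   # words containing the letter (once per word)
    total = Counter()      # total occurrences across all words
    for w in candidates:
        per_word.update(set(w))
        total.update(w)
    # Rank by (word count, total count); -ord(c) makes the final
    # fallback deterministic (alphabetically first letter wins).
    return max(per_word, key=lambda c: (per_word[c], total[c], -ord(c)))
```

For example, with candidates `["eel", "el", "at"]`, both 'e' and 'l' appear in two words, but 'e' occurs three times in total to 'l''s two, so 'e' wins the tie.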