http://www.perlmonks.org?node_id=779788


in reply to Re: Hangman Assistant
in thread Hangman Assistant

Oohh, definitely interesting results! It seems you are still using the 'find most common letter' method of finding the next best letter though. Is the only thing you changed the way you prune the wordlist?

A few tweaks here and there, the insertion of Perl capabilities into a contact lens, and I think we may have a viable cheating method at our disposal!

I don't mind occasionally having to reinvent a wheel; I don't even mind using someone's reinvented wheel occasionally. But it helps a lot if it is symmetric, contains no fewer than ten sides, and has the axle centered. I do tire of trapezoidal wheels with offset axles. --Joseph Newcomer

Re^3: Hangman Assistant
by Limbic~Region (Chancellor) on Jul 14, 2009 at 13:30 UTC
    Lawliet,
    Let me back into the pruning opportunity by explaining where I was going with the other (now abandoned) method. In my discussion with blokhead, I explained that it should be possible both to guarantee a better than 50/50 chance of guessing correctly and to prune the remaining candidates by more than 50%. I was setting out to do just that.

    Now let's assume we were going with 'eclectic'. There were 9,638 words in my dictionary with a length of 8. The letter that appeared in the most of those words was 'e', at 6,666. Note that I didn't count total occurrences of 'e', only 1 per word. This equates to 69%. What I set out to do was map the different ways the letter 'e' appears across those 6,666 words. In the word 'eclectic' it appears in positions 1 and 4, whereas in 'envelope' it appears in positions 1, 4 and 8. After guessing and seeing which positions were filled in, I could even eliminate words with letter 'e' even if they shared the common position because they didn't share all positions. That last part (all positions) was the piece I hadn't considered in my previous exercise.

    So here is the mind-blowing part. The most common set of positions across the 6,666 words with the letter 'e' still had fewer than 1,600 possible words. This means that by selecting the letter 'e' (69% chance of being right), a correct guess reduces the candidate list from 9,638 to fewer than 1,600 (and probably a lot further). It therefore seemed pointless to come up with weights for choosing a letter based on both the probability of being correct and the degree to which the candidate list is reduced, because the "dumb" method was already doing a superb job.
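    To make the 'all positions' pruning concrete, here is a minimal sketch in Python. It is not the actual Perl code from this thread; the function names and the tiny word list are made up for illustration. It counts each letter once per word, then keeps only the candidates whose positions for the guessed letter match the revealed positions exactly.

```python
from collections import Counter

def letter_word_counts(words):
    """For each letter, count how many words contain it (once per word)."""
    counts = Counter()
    for word in words:
        counts.update(set(word))  # set() so repeated letters count once
    return counts

def positions(word, letter):
    """Return the 1-based positions at which `letter` occurs in `word`."""
    return tuple(i for i, ch in enumerate(word, start=1) if ch == letter)

def prune(words, letter, revealed):
    """Keep only words whose positions for `letter` equal `revealed` exactly.

    An empty `revealed` means the guess was wrong, so every word that
    contains the letter is dropped.
    """
    revealed = tuple(revealed)
    return [w for w in words if positions(w, letter) == revealed]

# Hypothetical 8-letter candidate list, for illustration only.
candidates = ["eclectic", "envelope", "elevator", "abstract", "exterior"]

counts = letter_word_counts(candidates)
after_hit = prune(candidates, "e", (1, 4))   # 'e' revealed at positions 1 and 4
after_miss = prune(candidates, "e", ())      # 'e' was not in the word
```

    With this list, 'envelope' is dropped after the hit even though it shares positions 1 and 4, because it also has an 'e' at position 8.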

    I do have one final revision to make. I choose the letter that appears in the most candidate words, but I don't break ties; currently the winner is simply whatever sort returns first. I plan to add total count as a secondary tie-breaking condition to see if that improves results. I should post something later today.
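    The planned tie-break could look something like the following Python sketch. Again this is an illustration rather than the real implementation: `best_letter` and its `guessed` parameter are hypothetical names, and falling back to alphabetical order for any remaining tie is an assumption added only to keep the result deterministic.

```python
from collections import Counter

def best_letter(words, guessed=()):
    """Pick the unguessed letter found in the most words.

    Ties are broken by total occurrences across all words, then
    (an assumption for determinism) by alphabetical order.
    """
    word_counts = Counter()   # number of words containing each letter
    total_counts = Counter()  # total occurrences of each letter
    for word in words:
        word_counts.update(set(word))
        total_counts.update(word)

    candidates = sorted(c for c in word_counts if c not in guessed)
    if not candidates:
        return None
    # max() returns the first maximal item, so sorted input settles
    # any leftover tie alphabetically.
    return max(candidates, key=lambda c: (word_counts[c], total_counts[c]))

# 'a' and 'b' each appear in all three words, but 'b' occurs 4 times in total.
sample = ["abb", "abc", "bad"]
choice = best_letter(sample)
next_choice = best_letter(sample, guessed=("b",))
```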

    Cheers - L~R

      I could even eliminate words with letter 'e' even if they shared the common position because they didn't share all positions.

      Ah, that is what I have left to implement. The method I initially used is the exact same as the one you use besides the 'all positions' part, which is why yours works better.
