Beefy Boxes and Bandwidth Generously Provided by pair Networks RobOMonk
Welcome to the Monastery
 
PerlMonks  

Re: Challenge: 8 Letters, Most Words (Update2:Then again :)

by BrowserUk (Pope)
on Oct 04, 2013 at 16:05 UTC ( #1056900=note: print w/ replies, xml ) Need Help??


in reply to Challenge: 8 Letters, Most Words

Update2: Second guessing myself. Update: Amongst other possible errors, this rejects words that have more than 8 letters; whereas it should only reject words with more than 8 unique letters!

I get:

C:\test>1056884 words.txt a e i n o r s t: 458

Just brute force (takes 3 or 4 seconds on my 179000 word dictionary):

#! perl -slw use strict; my %stats; while( <> ) { chomp; next if length > 8; my $word = $_; push @{ $stats{ $_ } }, $word for split '', $word; } my @freq = sort{ @{ $stats{ $b } } <=> @{ $stats{ $a } }; } keys %stats; my %letters; ++$letters{ $_ } for @freq[ 0 .. 7 ]; my %uniq; for my $l ( keys %letters ) { WORD: for my $word ( @{ $stats{ $l } } ) { my %letters = %letters; for my $c ( split '', $word ) { unless( exists $letters{ $c } and --$letters{ $c } >= 0 ) + { #print( "rejected: $word" ); delete $uniq{ $word }; next WORD ; } } ++$uniq{ $word } ; } } print "@{[ sort keys %letters ]}: ", scalar keys %uniq;

With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
"Science is about questioning the status quo. Questioning authority".
In the absence of evidence, opinion is indistinguishable from prejudice.
div class= @{ $stats{ $a } }; } keys %stats; my %letters; ++$letters{ $_ } for @freq


Comment on Re: Challenge: 8 Letters, Most Words (Update2:Then again :)
Select or Download Code
Re^2: Challenge: 8 Letters, Most Words (Update2:Then again :)
by Limbic~Region (Chancellor) on Oct 04, 2013 at 19:18 UTC
    BrowserUk,
    This certainly doesn't look like brute force to me.

    First you get the words each letter appears in (regardless of how many times it appears). Then, you consider the top 8 letters based on how many words contain the letter but only allow each letter to appear once.

    Your solution would probably be a lot faster if you also threw out any word from the dictionary that had repeated letter since I can't see how you would ever consider them.

    Cheers - L~R

Re^2: Challenge: 8 Letters, Most Words (Update2:Then again :)
by McA (Deacon) on Oct 04, 2013 at 19:47 UTC

    Hi BrowserUK,

    while I'm looking at the solutions presented here, I tried to understand your solution knowing that you have often a different and interesting view of looking at the problem. I did a litte test case dict:

    ffffffff fffffffa afffffff bfffffff fffffffb ffffbfff

    With this dict the letters to choose to get the most matching words is 'bfffffff'. You algorithm prints something different. What have I missed?

    Best regards
    McA

      you have often a different and interesting view of looking at the problem.

      How many dictionary words contain 7 repeats of the same letter? :)


      With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
      Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
      "Science is about questioning the status quo. Questioning authority".
      In the absence of evidence, opinion is indistinguishable from prejudice.

        I cite from the description:

        ...link to dictionary you used...

        :-))

        BrowserUk,
        You don't need 7 repeats to exhibit the problem:
        aabc aacb aabc aabc abca eefg efge eegf eegf efeg aabcxy eefglm
        Your code produces: a b c e f g m x: 0

        Cheers - L~R

Log In?
Username:
Password:

What's my password?
Create A New User
Node Status?
node history
Node Type: note [id://1056900]
help
Chatterbox?
and the web crawler heard nothing...

How do I use this? | Other CB clients
Other Users?
Others musing on the Monastery: (9)
As of 2014-04-19 19:58 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    April first is:







    Results (483 votes), past polls