Beefy Boxes and Bandwidth Generously Provided by pair Networks
Pathologically Eclectic Rubbish Lister

Re^2: Challenge: 8 Letters, Most Words

by Limbic~Region (Chancellor)
on Oct 05, 2013 at 12:45 UTC ( #1057025=note: print w/replies, xml ) Need Help??

in reply to Re: Challenge: 8 Letters, Most Words
in thread Challenge: 8 Letters, Most Words

I considered a bucket approach as well - but not quite the way you did it. What I was going to do was bucket all words of the same length and then at the end roll up 2 letter words into 3 letter words, 3 letter words into 4 letter words, etc. I realized this won't guarantee the best solution because of what I have explained elsewhere in this thread (taking some from one group and some from another can lead to the best solution)

I ran your code against 2of12inf.txt (actually, the file I had already reduced). On my machine it took 2 hours and 3 minutes. Here is the output:

Working with 40933 words Letters: primates Number of words: 316
The correct answer has 346 words.

Cheers - L~R

Replies are listed 'Best First'.
Re^3: Challenge: 8 Letters, Most Words
by aaron_baugher (Curate) on Oct 05, 2013 at 13:45 UTC

    Yes, I had it dump the complete array to a file the last time, and the winning set "painters" only gets 292 words from my script. I had a feeling it could miss some somehow, but I can't quite figure out how yet.

    Update: I see the problem now. Let's say "painters" is the 100th word, and "at" is word #2 and "not" is word #3. "not" gets added to bucket #2 along with "at" because it can fit, but now "painters" can't go into that bucket. And since "at" has already been placed in bucket #2 and possibly bucket #1, it can't be placed in bucket #100 with "painters" or in any of the buckets in between that "painters" might be in.

    So to make my approach work, I guess once all the words have been placed in at least one bucket, I would have to go back through them again, retrying them against each bucket past the one they started in. And even then I'm not sure you'd be guaranteed to get a bucket with the best possible combination.

    I had a nagging feeling there was a problem like that, but I needed a night's sleep to see it.

    Aaron B.
    Available for small or large Perl jobs; see my home node.

      I think all you need to do is to sort the words descending by length before you run through your loop. See my variation below which is very similar to yours Re^2: Challenge: 8 Letters, Most Words.

      The reason is that a longer word will never be added to the stack of a shorter word, so you do not need to check that.

        That should help, but even then it seems like it would be possible for some matches to be ignored with my method. For instance, say the word 'post' comes along, and next the word 'dime.' Since 'dime' can fit into the bucket started by 'post,' that bucket now holds 'postdime.' Later the word 'cot' comes along, and now it can't be added to that bucket, so there will be no bucket that ever contains 'post' and 'cot.'

        As a test, I tried this short list of words: post dime coat tear. The three words "post coat tear" can fit into 8 letters: aceoprst. Mine failed to find that, as expected, because the 'post' bucket had already been filled with 'dime,' so 'coat' and 'tear' never got a chance at it. If I move 'dime' to the end, then it works, but there's no way to anticipate that. So my method fails, even if I sort the words by length first.

        Aaron B.
        Available for small or large Perl jobs; see my home node.

Log In?

What's my password?
Create A New User
Node Status?
node history
Node Type: note [id://1057025]
and all is quiet...

How do I use this? | Other CB clients
Other Users?
Others cooling their heels in the Monastery: (6)
As of 2018-06-18 19:48 GMT
Find Nodes?
    Voting Booth?
    Should cpanminus be part of the standard Perl release?

    Results (110 votes). Check out past polls.