Beefy Boxes and Bandwidth Generously Provided by pair Networks
Welcome to the Monastery

Re^6: Challenge: 8 Letters, Most Words

by LanX (Chancellor)
on Oct 05, 2013 at 13:19 UTC ( #1057028=note: print w/replies, xml ) Need Help??

in reply to Re^5: Challenge: 8 Letters, Most Words
in thread Challenge: 8 Letters, Most Words

> In other words, choose all combinations of any 8 chars out of the following string: "aaaabbbcccddddeeeeffffgggghhhiiijjkkkllllmmmnnnnoooopppqrrrrssssstttuuuuvvwwwxxyyyzzzz"

The point is to handle duplications, if all letters are unique the answer is simply (26 over 8)

DB<288> sub fac {my $x=1; $x*=$_ for 2..$_[0]; $x} DB<289> sub binom { my ($n,$k)=@_; fac($n)/(fac($n-$k)*fac($k)) } DB<290> binom 26,8 => 1562275

Reasoning is simple, it calculates all binary vectors of length 26 with exactly 8 1-bits.

But with duplications its more complicated, e.g. 4 out of "aabcd" is not (5 over 4)=5

a bcd abcd aa cd aab d aabc

cause the first 2 solutions are identical.

Generating all combinations and filtering the unique once is normally not very clever, cause the overhead can be enormous.

DB<292> length "aaaabbbcccddddeeeeffffgggghhhiiijjkkkllllmmmnnnnoooo +pppqrrrrssssstttuuuuvvwwwxxyyyzzzz" => 86 DB<293> binom 86,8 => "53060358690"

And I'm stuck finding a formula which calculates all unique solutions, but generating is easier, just don't allow bitvectors with "gaps" between identical letters:


"aaaabbbc..." # pattern "1000100...." # ok => ab... "1100100...." # ok => aab... "1001100.... # not ok => aab...

I will update this post with a loop generating all possibilities soon.


indeed L~R's number of 12461993 possibilities is correct

The following (non-optimized) code took 3 minutes to calculate them:

use strict; use warnings; use Data::Dump qw/pp/; my %max=( a => 4, b => 3, c => 3, d => 4, e => 4, f => 4, g => 4, h => 3, i => 3, j => 2, k => 3, l => 4, m => 3, n => 4, o => 4, p => 3, q => 1, r => 4, s => 5, t => 3, u => 4, v => 2, w => 3, x => 2, y => 3, z => 4, ); my $maxword=8; #%max=(a=>2,b=>1,c=>1,d=>1); #$maxword=4; my @keysort=sort keys %max; my $nsolutions=0; sub generate { my ($idx,$word)=@_; #pp \@_; return if $idx >= @keysort; my $newword; my $key= $keysort[$idx]; for my $n (0..$max{$key}){ # print "\t"x$idx ,"$n x $key\n"; my $newword = $word.($key x $n); if ($maxword== length $newword) { # print "-> $newword\n"; $nsolutions++; last; } generate($idx+1, $newword ) } } generate(0,""); print $nsolutions;

Cheers Rolf

( addicted to the Perl Programming Language)

Replies are listed 'Best First'.
Re^7: Challenge: 8 Letters, Most Words
by Limbic~Region (Chancellor) on Oct 05, 2013 at 21:08 UTC
    The C code I wrote to calculate it is below. It finished nearly instantly. I was trying to determine how to write the rest of the code (could it fit in memory). It also ended up being part of my final solution since I realized it was unnecessary to keep them in memory using a high water mark algorithm.

    Cheers - L~R

      isn't the goal to find a clever Perl solution for the whole problem which does it within less than an hour? :)

      I have two good ideas but no time to hack them. :(

      Cheers Rolf

      ( addicted to the Perl Programming Language)

        Absolutely - but I am not that smart. I was hoping someone would provide a solution that was guaranteed to be correct that didn't require exhaustive searching. So far, all of the heuristic methods have flaws. That isn't to say I am not smart enough to come up with a heuristic solution but I needed to know for certain.

        Cheers - L~R

Re^7: Challenge: 8 Letters, Most Words
by Limbic~Region (Chancellor) on Oct 07, 2013 at 17:52 UTC
    And I'm stuck finding a formula which calculates all unique solutions

    If a letter was allowed to repeat up to the maximum 8 times, the formula is:

    (n + (k - 1))! -------------- k! (n - 1)! 33! -------- = 13_884_156 8! * 25!

    I am not sure how to get from 13_884_156 to 12_461_993 strictly through calculation but it seems like it should be possible.

    Cheers - L~R

      Hei L~R,

      I'm sure there is a formula, but I doubt it's a simple formula. (rather something with 26 subterms)

      I could dig into it for some time, but firing our script which just counts all combinations seems fine for me! :)

      So you are still obsessed with this problem? :)

      Cheers Rolf

      ( addicted to the Perl Programming Language)

        So you are still obsessed with this problem? :)

        I believe that if a formula exists, it would be <total> - <calculate for a> - <calculate for b> ... <calculate for z>.

        I am not really obsessed with the problem. I just like to learn and this was a fun distraction. Most of the heuristic solutions in this thread are ones I had considered and abandoned because I could think of a trivial case where it failed. It is also why I resorted to an exhaustive search. Not because a heuristic solution wouldn't be good enough or that it may be guaranteed to give the correct solution on any real dictionary but because I wanted to see if I could do it in a reasonable (less than 24 hours) amount of time. I was completely shocked that it only took 12 minutes on my home machine.

        Cheers - L~R

Log In?

What's my password?
Create A New User
Node Status?
node history
Node Type: note [id://1057028]
and all is quiet...

How do I use this? | Other CB clients
Other Users?
Others lurking in the Monastery: (7)
As of 2017-01-21 16:03 GMT
Find Nodes?
    Voting Booth?
    Do you watch meteor showers?

    Results (185 votes). Check out past polls.