Beefy Boxes and Bandwidth Generously Provided by pair Networks
Do you know where your variables are?
 
PerlMonks  

Re^6: Challenge: 8 Letters, Most Words

by LanX (Canon)
on Oct 05, 2013 at 13:19 UTC ( #1057028=note: print w/ replies, xml ) Need Help??


in reply to Re^5: Challenge: 8 Letters, Most Words
in thread Challenge: 8 Letters, Most Words

> In other words, choose all combinations of any 8 chars out of the following string: "aaaabbbcccddddeeeeffffgggghhhiiijjkkkllllmmmnnnnoooopppqrrrrssssstttuuuuvvwwwxxyyyzzzz"

The point is to handle duplications, if all letters are unique the answer is simply (26 over 8)

DB<288> sub fac {my $x=1; $x*=$_ for 2..$_[0]; $x} DB<289> sub binom { my ($n,$k)=@_; fac($n)/(fac($n-$k)*fac($k)) } DB<290> binom 26,8 => 1562275

Reasoning is simple, it calculates all binary vectors of length 26 with exactly 8 1-bits.

But with duplications its more complicated, e.g. 4 out of "aabcd" is not (5 over 4)=5

a bcd abcd aa cd aab d aabc

cause the first 2 solutions are identical.

Generating all combinations and filtering the unique once is normally not very clever, cause the overhead can be enormous.

DB<292> length "aaaabbbcccddddeeeeffffgggghhhiiijjkkkllllmmmnnnnoooo +pppqrrrrssssstttuuuuvvwwwxxyyyzzzz" => 86 DB<293> binom 86,8 => "53060358690"

And I'm stuck finding a formula which calculates all unique solutions, but generating is easier, just don't allow bitvectors with "gaps" between identical letters:

so

"aaaabbbc..." # pattern "1000100...." # ok => ab... "1100100...." # ok => aab... "1001100.... # not ok => aab...

I will update this post with a loop generating all possibilities soon.

update

indeed L~R's number of 12461993 possibilities is correct

The following (non-optimized) code took 3 minutes to calculate them:

use strict; use warnings; use Data::Dump qw/pp/; my %max=( a => 4, b => 3, c => 3, d => 4, e => 4, f => 4, g => 4, h => 3, i => 3, j => 2, k => 3, l => 4, m => 3, n => 4, o => 4, p => 3, q => 1, r => 4, s => 5, t => 3, u => 4, v => 2, w => 3, x => 2, y => 3, z => 4, ); my $maxword=8; #%max=(a=>2,b=>1,c=>1,d=>1); #$maxword=4; my @keysort=sort keys %max; my $nsolutions=0; sub generate { my ($idx,$word)=@_; #pp \@_; return if $idx >= @keysort; my $newword; my $key= $keysort[$idx]; for my $n (0..$max{$key}){ # print "\t"x$idx ,"$n x $key\n"; my $newword = $word.($key x $n); if ($maxword== length $newword) { # print "-> $newword\n"; $nsolutions++; last; } generate($idx+1, $newword ) } } generate(0,""); print $nsolutions;

Cheers Rolf

( addicted to the Perl Programming Language)


Comment on Re^6: Challenge: 8 Letters, Most Words
Select or Download Code
Replies are listed 'Best First'.
Re^7: Challenge: 8 Letters, Most Words
by Limbic~Region (Chancellor) on Oct 05, 2013 at 21:08 UTC
    LanX,
    The C code I wrote to calculate it is below. It finished nearly instantly. I was trying to determine how to write the rest of the code (could it fit in memory). It also ended up being part of my final solution since I realized it was unnecessary to keep them in memory using a high water mark algorithm.

    Cheers - L~R

      isn't the goal to find a clever Perl solution for the whole problem which does it within less than an hour? :)

      I have two good ideas but no time to hack them. :(

      Cheers Rolf

      ( addicted to the Perl Programming Language)

        LanX,
        Absolutely - but I am not that smart. I was hoping someone would provide a solution that was guaranteed to be correct that didn't require exhaustive searching. So far, all of the heuristic methods have flaws. That isn't to say I am not smart enough to come up with a heuristic solution but I needed to know for certain.

        Cheers - L~R

Re^7: Challenge: 8 Letters, Most Words
by Limbic~Region (Chancellor) on Oct 07, 2013 at 17:52 UTC
    LanX,
    And I'm stuck finding a formula which calculates all unique solutions

    If a letter was allowed to repeat up to the maximum 8 times, the formula is:

    (n + (k - 1))! -------------- k! (n - 1)! 33! -------- = 13_884_156 8! * 25!

    I am not sure how to get from 13_884_156 to 12_461_993 strictly through calculation but it seems like it should be possible.

    Cheers - L~R

      Hei L~R,

      I'm sure there is a formula, but I doubt it's a simple formula. (rather something with 26 subterms)

      I could dig into it for some time, but firing our script which just counts all combinations seems fine for me! :)

      So you are still obsessed with this problem? :)

      Cheers Rolf

      ( addicted to the Perl Programming Language)

        LanX,
        So you are still obsessed with this problem? :)

        I believe that if a formula exists, it would be <total> - <calculate for a> - <calculate for b> ... <calculate for z>.

        I am not really obsessed with the problem. I just like to learn and this was a fun distraction. Most of the heuristic solutions in this thread are ones I had considered and abandoned because I could think of a trivial case where it failed. It is also why I resorted to an exhaustive search. Not because a heuristic solution wouldn't be good enough or that it may be guaranteed to give the correct solution on any real dictionary but because I wanted to see if I could do it in a reasonable (less than 24 hours) amount of time. I was completely shocked that it only took 12 minutes on my home machine.

        Cheers - L~R

Log In?
Username:
Password:

What's my password?
Create A New User
Node Status?
node history
Node Type: note [id://1057028]
help
Chatterbox?
and the web crawler heard nothing...

How do I use this? | Other CB clients
Other Users?
Others rifling through the Monastery: (8)
As of 2015-07-31 02:45 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    The top three priorities of my open tasks are (in descending order of likelihood to be worked on) ...









    Results (274 votes), past polls