Perl-Sensitive Sunglasses PerlMonks

### Re^6: Challenge: 8 Letters, Most Words

by LanX (Bishop)
 on Oct 05, 2013 at 13:19 UTC ( #1057028=note: print w/replies, xml ) Need Help??

in reply to Re^5: Challenge: 8 Letters, Most Words
in thread Challenge: 8 Letters, Most Words

> In other words, choose all combinations of any 8 chars out of the following string: "aaaabbbcccddddeeeeffffgggghhhiiijjkkkllllmmmnnnnoooopppqrrrrssssstttuuuuvvwwwxxyyyzzzz"

The point is to handle duplications, if all letters are unique the answer is simply (26 over 8)

```  DB<288> sub fac {my \$x=1; \$x*=\$_ for 2..\$_[0]; \$x}

DB<289> sub binom { my (\$n,\$k)=@_; fac(\$n)/(fac(\$n-\$k)*fac(\$k)) }

DB<290> binom 26,8
=> 1562275

Reasoning is simple, it calculates all binary vectors of length 26 with exactly 8 1-bits.

But with duplications its more complicated, e.g. 4 out of "aabcd" is not (5 over 4)=5

```a bcd
abcd
aa cd
aab d
aabc

cause the first 2 solutions are identical.

Generating all combinations and filtering the unique once is normally not very clever, cause the overhead can be enormous.

```  DB<292> length "aaaabbbcccddddeeeeffffgggghhhiiijjkkkllllmmmnnnnoooo
=> 86

DB<293> binom 86,8
=> "53060358690"

And I'm stuck finding a formula which calculates all unique solutions, but generating is easier, just don't allow bitvectors with "gaps" between identical letters:

so

``` "aaaabbbc..." # pattern
"1000100...." # ok      => ab...
"1100100...." # ok      => aab...
"1001100....  # not ok  => aab...

I will update this post with a loop generating all possibilities soon.

##### update

indeed L~R's number of 12461993 possibilities is correct

The following (non-optimized) code took 3 minutes to calculate them:

```use strict;
use warnings;
use Data::Dump qw/pp/;

my %max=(
a => 4,
b => 3,
c => 3,
d => 4,
e => 4,
f => 4,
g => 4,
h => 3,
i => 3,
j => 2,
k => 3,
l => 4,
m => 3,
n => 4,
o => 4,
p => 3,
q => 1,
r => 4,
s => 5,
t => 3,
u => 4,
v => 2,
w => 3,
x => 2,
y => 3,
z => 4,
);
my \$maxword=8;

#%max=(a=>2,b=>1,c=>1,d=>1);
#\$maxword=4;

my @keysort=sort keys %max;

my \$nsolutions=0;

sub generate {
my (\$idx,\$word)=@_;
#pp \@_;

return if \$idx >= @keysort;

my \$newword;

my \$key= \$keysort[\$idx];
for my \$n (0..\$max{\$key}){
#    print "\t"x\$idx ,"\$n x \$key\n";
my \$newword = \$word.(\$key x \$n);

if (\$maxword== length \$newword) {
#      print "-> \$newword\n";
\$nsolutions++;
last;
}

generate(\$idx+1, \$newword  )
}
}

generate(0,"");

print \$nsolutions;

Cheers Rolf

( addicted to the Perl Programming Language)

Replies are listed 'Best First'.
Re^7: Challenge: 8 Letters, Most Words
by Limbic~Region (Chancellor) on Oct 05, 2013 at 21:08 UTC
LanX,
The C code I wrote to calculate it is below. It finished nearly instantly. I was trying to determine how to write the rest of the code (could it fit in memory). It also ended up being part of my final solution since I realized it was unnecessary to keep them in memory using a high water mark algorithm.

Cheers - L~R

isn't the goal to find a clever Perl solution for the whole problem which does it within less than an hour? :)

I have two good ideas but no time to hack them. :(

Cheers Rolf

( addicted to the Perl Programming Language)

LanX,
Absolutely - but I am not that smart. I was hoping someone would provide a solution that was guaranteed to be correct that didn't require exhaustive searching. So far, all of the heuristic methods have flaws. That isn't to say I am not smart enough to come up with a heuristic solution but I needed to know for certain.

Cheers - L~R

Re^7: Challenge: 8 Letters, Most Words
by Limbic~Region (Chancellor) on Oct 07, 2013 at 17:52 UTC
LanX,
And I'm stuck finding a formula which calculates all unique solutions

If a letter was allowed to repeat up to the maximum 8 times, the formula is:

```(n + (k - 1))!
--------------
k! (n - 1)!

33!
--------      = 13_884_156
8! * 25!

I am not sure how to get from 13_884_156 to 12_461_993 strictly through calculation but it seems like it should be possible.

Cheers - L~R

Hei L~R,

I'm sure there is a formula, but I doubt it's a simple formula. (rather something with 26 subterms)

I could dig into it for some time, but firing our script which just counts all combinations seems fine for me! :)

So you are still obsessed with this problem? :)

Cheers Rolf

( addicted to the Perl Programming Language)

LanX,
So you are still obsessed with this problem? :)

I believe that if a formula exists, it would be <total> - <calculate for a> - <calculate for b> ... <calculate for z>.

I am not really obsessed with the problem. I just like to learn and this was a fun distraction. Most of the heuristic solutions in this thread are ones I had considered and abandoned because I could think of a trivial case where it failed. It is also why I resorted to an exhaustive search. Not because a heuristic solution wouldn't be good enough or that it may be guaranteed to give the correct solution on any real dictionary but because I wanted to see if I could do it in a reasonable (less than 24 hours) amount of time. I was completely shocked that it only took 12 minutes on my home machine.

Cheers - L~R

Create A New User
Node Status?
node history
Node Type: note [id://1057028]
help
Chatterbox?
and all is quiet...

How do I use this? | Other CB clients
Other Users?
Others exploiting the Monastery: (5)
As of 2017-12-14 00:23 GMT
Sections?
Information?
Find Nodes?
Leftovers?
Voting Booth?
What programming language do you hate the most?

Results (381 votes). Check out past polls.

Notices?