in reply to Re^5: Challenge: 8 Letters, Most Words in thread Challenge: 8 Letters, Most Words
> In other words, choose all combinations of any 8 chars out of the following string: "aaaabbbcccddddeeeeffffgggghhhiiijjkkkllllmmmnnnnoooopppqrrrrssssstttuuuuvvwwwxxyyyzzzz"
The point is to handle duplications, if all letters are unique the answer is simply (26 over 8)
DB<288> sub fac {my $x=1; $x*=$_ for 2..$_[0]; $x}
DB<289> sub binom { my ($n,$k)=@_; fac($n)/(fac($n$k)*fac($k)) }
DB<290> binom 26,8
=> 1562275
Reasoning is simple, it calculates all binary vectors of length 26 with exactly 8 1bits.
But with duplications its more complicated, e.g. 4 out of "aabcd" is not (5 over 4)=5
a bcd
abcd
aa cd
aab d
aabc
cause the first 2 solutions are identical.
Generating all combinations and filtering the unique once is normally not very clever, cause the overhead can be enormous.
DB<292> length "aaaabbbcccddddeeeeffffgggghhhiiijjkkkllllmmmnnnnoooo
+pppqrrrrssssstttuuuuvvwwwxxyyyzzzz"
=> 86
DB<293> binom 86,8
=> "53060358690"
And I'm stuck finding a formula which calculates all unique solutions, but generating is easier, just don't allow bitvectors with "gaps" between identical letters:
so
"aaaabbbc..." # pattern
"1000100...." # ok => ab...
"1100100...." # ok => aab...
"1001100.... # not ok => aab...
I will update this post with a loop generating all possibilities soon.
update
indeed L~R's number of 12461993 possibilities is correct
The following (nonoptimized) code took 3 minutes to calculate them:
use strict;
use warnings;
use Data::Dump qw/pp/;
my %max=(
a => 4,
b => 3,
c => 3,
d => 4,
e => 4,
f => 4,
g => 4,
h => 3,
i => 3,
j => 2,
k => 3,
l => 4,
m => 3,
n => 4,
o => 4,
p => 3,
q => 1,
r => 4,
s => 5,
t => 3,
u => 4,
v => 2,
w => 3,
x => 2,
y => 3,
z => 4,
);
my $maxword=8;
#%max=(a=>2,b=>1,c=>1,d=>1);
#$maxword=4;
my @keysort=sort keys %max;
my $nsolutions=0;
sub generate {
my ($idx,$word)=@_;
#pp \@_;
return if $idx >= @keysort;
my $newword;
my $key= $keysort[$idx];
for my $n (0..$max{$key}){
# print "\t"x$idx ,"$n x $key\n";
my $newword = $word.($key x $n);
if ($maxword== length $newword) {
# print "> $newword\n";
$nsolutions++;
last;
}
generate($idx+1, $newword )
}
}
generate(0,"");
print $nsolutions;
Cheers Rolf
( addicted to the Perl Programming Language)
Re^7: Challenge: 8 Letters, Most Words by Limbic~Region (Chancellor) on Oct 05, 2013 at 21:08 UTC 
LanX,
The C code I wrote to calculate it is below. It finished nearly instantly. I was trying to determine how to write the rest of the code (could it fit in memory).
It also ended up being part of my final solution since I realized it was unnecessary to keep them in memory using a high water mark algorithm.
 [reply] [d/l] 

 [reply] 

LanX,
Absolutely  but I am not that smart. I was hoping someone would provide a solution that was guaranteed to be correct that didn't require exhaustive searching. So far, all of the heuristic methods have flaws. That isn't to say I am not smart enough to come up with a heuristic solution but I needed to know for certain.
 [reply] 

Re^7: Challenge: 8 Letters, Most Words by Limbic~Region (Chancellor) on Oct 07, 2013 at 17:52 UTC 
LanX,
And I'm stuck finding a formula which calculates all unique solutions
If a letter was allowed to repeat up to the maximum 8 times, the formula is:
(n + (k  1))!

k! (n  1)!
33!
 = 13_884_156
8! * 25!
I am not sure how to get from 13_884_156 to 12_461_993 strictly through calculation but it seems like it should be possible.
 [reply] [d/l] 

Hei L~R,
I'm sure there is a formula, but I doubt it's a simple formula. (rather something with 26 subterms)
I could dig into it for some time, but firing our script which just counts all combinations seems fine for me! :)
So you are still obsessed with this problem? :)
Cheers Rolf
( addicted to the Perl Programming Language)
 [reply] 

LanX,
So you are still obsessed with this problem? :)
I believe that if a formula exists, it would be <total>  <calculate for a>  <calculate for b> ... <calculate for z>.
I am not really obsessed with the problem. I just like to learn and this was a fun distraction. Most of the heuristic solutions in this thread are ones I had considered and abandoned because I could think of a trivial case where it failed. It is also why I resorted to an exhaustive search. Not because a heuristic solution wouldn't be good enough or that it may be guaranteed to give the correct solution on any real dictionary but because I wanted to see if I could do it in a reasonable (less than 24 hours) amount of time. I was completely shocked that it only took 12 minutes on my home machine.
 [reply] 

