Anonymous Monk has asked for the
wisdom of the Perl Monks concerning the following question:
Dear monks,
you all know of wordgames like Wordfeud and Scrabble.
At the very beginning of such a game there is a bag full of tiles.
In English Wordfeud for example there are 10 A:s, 2 B:s etc.
Now  how to list all possible racks in the initial stage? A rack consists of 7 tiles, and the order of the tiles at hand are of course of no importance.
And, additionally compute the probability of each rack.
Something like this
AAAAAAA 0.00x probability
AAAAAAB 0.000x probability
... etc ...
To list the possible racks I have already looked at some CPAN modules, but most modules seems to handle only cases where there is only one element of each kind in the set from which to draw. The one I found that could handle this Math::Combinatorics was so ridicilously slow it would probably take days or even weeks to generate the list!
So now I turn to thee, monks, for spiritual Perl advice.
/ L
Re: Wordfeud racks by daxim (Chaplain) on Jul 30, 2013 at 11:58 UTC 
Use Algorithm::Combinatorics, that's written in XS.
Edit: you can make your problem easier by mapping your multiset/bag into a set first, so that the simpler algorithms can work: A,A,A,A,A,A,A,A,A,A,B,B,… ⤖ A1,A2,A3,A4,A5,A6,A7,A8,A9,A10,B1,B2,…  [reply] 

Thanks for your suggestion, But I can't see how to make A::C list only unique racks.
Example
If my bag of tiles contained only 3 tiles A A B
And I wanted to list all possible racks.
I would end up with only 2 racks:
AA
AB
A::C will give me all combinations of the "physical" tiles:
A_{1}A_{2}
A_{1}B
A_{2}B
Can you please elaborate a little on how to use the mapping to solve this problem?
/L
 [reply] 

 [reply] 

#!/usr/bin/perl
use warnings;
use strict;
use 5.010; # definedor, say
my @tiles = (('_') x 2, # 2 blank tiles (scoring 0 points)
# 1 point
('E') x 12,
qw(A I) x 9,
('O') x 8,
qw(N R T) x 6,
qw(L S U) x 4,
# 2 points
('D') x 4,
('G') x 3,
# 3 points
qw(B C M P) x 2,
# 4 points
qw(F H V W Y) x 2,
# 5 points
('K'),
# 8 points
qw(J X),
# 10 points
qw(Q Z));
my $rack_size = 5;
my %hash;
my @rack = 0 .. $rack_size  1;
while ($rack[1] != @tiles) {
my $string = join q(), map $tiles[$_], @rack;
$hash{$string}++;
my $first_to_move = 0;
$first_to_move++ while $rack[$first_to_move] + 1 == ($rack[$first_
+to_move + 1] // 1);
$rack[$first_to_move]++;
for my $i (0 .. $first_to_move  1) {
$rack[$i] = $i;
}
}
my $combinations = keys %hash;
my $sum = 0;
$sum += $_ for values %hash;
print map "$_ => " . ($hash{$_} / $sum) . "\n", sort { $hash{$a} <=> $
+hash{$b} } keys %hash;
print "Combinations: $combinations, Count: $sum.\n";
With the module, the important part of the code simplifies to:
use Algorithm::Combinatorics qw(combinations);
my %hash;
my $iterator = combinations(\@tiles, $rack_size);
while (my $rack = $iterator>next) {
$hash{ join q(), @$rack }++;
}
Update: I did some benchmarks. It seems the module would get faster than my code on racks greater than 5, but it will still run for days.
Update 2: Fixed a bug in the code.
 [reply] [d/l] [select] 

performance tip, you could presize the hash, so there isn't need to grow it
 [reply] 

I ran the code for rack size 7. It took 12 hours 6 mins 18 secs on a 4 core Xeon box with 8 GB memory, OS linux. I am not going to copy the whole output here, just a few first and last lines:
MMVFWXQ => 6.24704795748769e11
__BPMKQ => 6.24704795748769e11
BCBMYHV => 6.24704795748769e11
...
EEAIOND => 3.56231662727778e05
EEAIONU => 3.56231662727778e05
EEAIONL => 3.56231662727778e05
Combinations: 8275164, Count: 16007560800.
I probably have an error in the code :)
Update: I see it now. The error was my effort to "DRY". The @tiles array must be defined as follows to make it work:
my @tiles = (('_') x 2, # 2 blank tiles (scoring 0 points)
# 1 point
('E') x 12,
('A') x 9, ('I') x 9,
('O') x 8,
('N') x 6, ('R') x 6, ('T') x 6,
('L') x 4, ('S') x 4, ('U') x 4,
# 2 points
('D') x 4,
('G') x 3,
# 3 points
('B') x 2, ('C') x 2, ('M') x 2, ('P') x 2,
# 4 points
('F') x 2, ('H') x 2, ('V') x 2, ('W') x 2, ('Y') x 2,
# 5 points
('K'),
# 8 points
qw(J X),
# 10 points
qw(Q Z));
Stay tuned, the next result is comming in 12 hours.
 [reply] [d/l] [select] 

Re: Wordfeud racks (Algorithm::Permute) by Anonymous Monk on Jul 30, 2013 at 12:02 UTC 
 [reply] 

No, that is about permutations, where order matters.
What I want is all unique combinations...
/L
 [reply] 

 [reply] 


Re: Wordfeud racks by BillKSmith (Hermit) on Jul 30, 2013 at 13:31 UTC 
Have you estimated how many racks are possible? How long do you expect it takes to compute them all? Your extrapalation to days or weeks may be very reasonable.
 [reply] 

 [reply] 
Re: Wordfeud racks by RichardK (Curate) on Jul 30, 2013 at 15:31 UTC 
How many tiles are in the bag?
for example, if it's only 100 then you're going to have to create 80,678,106,432,000 unique combinations  which might take a while :)
 [reply] 

$ echo $(echo {2..100}  sed 's/ /*/g') \
/ $(echo {2..7}  sed 's= =/=g') \
/ $(echo {2..93}  sed 's= =/=g')  bc
16007560800
Also confirmed at http://www.mathsisfun.com/combinatorics/combinationspermutationscalculator.html and http://easycalculation.com/statistics/permutationcombination.php.
 [reply] [d/l] 

Sorry, I meant permutations, so there a N!/(NK)! ways to select K tiles from a bag of N.
Or for 7 of 100 tiles = 100*99*98*97*96*95*94 permutations.
I think it will be easier to calculate the probabilities of the different combinations rather than count them, but it's still going to be a fair amount of work.
 [reply] 
Re: Wordfeud racks by sundialsvc4 (Monsignor) on Jul 30, 2013 at 17:11 UTC 
“All possible racks,” in such a game, is an infeasible number of possibilities to actually enumerate ... it would take weeks at best, if not months or centuries. But you don’t need to “list them, then count them,” in order to compute the probability of any given draw.
If you first presume that “a unique draw” would be one in which the letters are always arranged in alphabetic order, so that AABA and ABAA are considered to be the same, then any particular draw could be equally expressed as a vector of 26 elements, such as (in this case) [3, 1, 0, 0 ...]. There are two rules to determine if a vector is valid: (a) the total of all elements in the vector must equal the total size of the draw, and (b) the value of the element in any given position must not exceed the defined total number of pieces for that position (letter).
It is practical to construct a computer program to map out all of these combinations, since it is possible to eliminate billions of choices by recognizing as soon as possible when either of these two constraints have been exceeded. (As soon as you know that, cutshort the search.) Notice that there is nothing at all “random” about the algorithm. The probability of any given vector is simply 1/n, where n is the number of unique vectors that are found to exist. A similar technique can be used to solve the Eight Queens problem.
It could be that what you’re actually interested in is: “given a valid vector v, how many arrangements of letters correspond to that vector?” This is a calculation that involves factorials (n!). The answer is [ F(v) ::= ( v[1]! * v[2]! * ... * v[n]! ) ] for any given vector v of size n, no matter which particular letters might be involved. Yes, having perfected the algorithm to give you the list of all possible vectors, with the help of bigint, you could also solve that problem. The denominator is [ SUM( f(v) ) for all v ], and the numerator is simply f(v).
 [reply] 

“All possible racks,” in such a game, is an infeasible number of possibilities to actually enumerate ... it would take weeks at best, if not months or centuries. But you don’t need to “list them, then count them,” in order to compute the probability of any given draw.
I don't think so. Even assuming the worsecase scenario that the bag has at least 7 copies of each letter, the number of possible ordered racks is at most 26**7, or about 8 billion (we absolutely do not care whether the A that you get is A1, or A2, or A3..., it is just an A). This is admitedly a big number, but you certainly don't need weeks (let alone centuries) to list them all on a computer by the standards of today. And the actual number of racks is in fact much smaller for two reasons: some letters have only possibly two or three copies and that reduces quite drastically the number of actual possibilities, and many racks are actually equivalent (permutations of another one). I do not have enough information on the game to compute everything, but I would be surprised if the number of actually different racks would exceed a couple of millions. So at most a few seconds to enumerate them all on a modern computer.
 [reply] 

 [reply] 

Re: Wordfeud racks by Anonymous Monk on Jul 31, 2013 at 00:06 UTC 
There are only 15 different types of racks to consider: 7 different letters (1111111), 6 different letters (211111), 5 different letters (31111 or 22111), etc, all the way down to a rack with all identical letters.
Assuming the bag contains at least 7 tiles for each letter (and we have 26 different tiles, that is, no blanks), it's easy to calculate how many different racks you can make for each type. For instance, type 111111 give you 26*25*24*23*22*21*20/7*6*5*4*3*2*1 == 657800 different racks. Six different letters on the board gives you the most combinations: 26*25*24*23*22*21*6/(5*4*3*2*1) == 8288280. Etc.
I leave it up to you to calculate the other numbers, but it quickly peters down. The number of different racks to consider is in the order of tens of millions, (less than 20 million I guess).
Enumerating all possible racks, given an sufficient number of tiles is doable. And given that list and the actual distribution, you can calculate the actual odds (some will have probability, as I doubt Wordfeud has 7 Zs in the bag).
Of course, you may want to be smart, and just derive the right number, using some combinatoric arguments, but for many people here in this forum, doing it bruteforce is going to be easier.  [reply] 
Re: Wordfeud racks by karlgoethebier (Deacon) on Jul 31, 2013 at 09:39 UTC 
 [reply] 
Re: Wordfeud racks by Anonymous Monk on Aug 02, 2013 at 21:43 UTC 
Thanks to all who read my question and made an effort to actually understand the problem and provided a wellconsidered answer. I learnt something from choroba:s code, even though it took me quite a while to understand what was going on esp. in the "$first_to_move"section...
I eventually ended up in a completly different direction writing the following code. (and borrowing the nCr function from here 68077 )
Just listing the 3199724 racks takes about 6 seconds on my old laptop running Win7 32 bit. When also computing the probabilities it toke 2 minutes and 46 seconds to complete.
/L  [reply] [d/l] 

