1: =pod What is a Cartesian Cross Product?
2:
3: I think this is just too damn cool to pass up. If you don't
4: know what a Cartesian (Cross-) Product is, it's basically:
5:
6: A = (1,2,3)
7: B = (4,5)
8: CCP(A,B) =>
9: (1,4)
10: (1,5)
11: (2,4)
12: (2,5)
13: (3,4)
14: (3,5)
15:
16: Yay, that's all well and good. Here's how to implement the
17: Cartesian Product generator in Perl:
18:
19: =pod Explanation of algorithm used
20:
21: Given a list of sets, say ([a,b], [c,d,e], [f,g]), we first determine how
22: many sets can be created. Mathematically, this is determined as follows:
23:
24: For a list of sets, { a[1], a[2], ..., a[n] }, to determine how many sets
25: can be created by choosing an element from a[1] as the first element of a
26: set, an element of a[2] for the second element, and so on, picking an
27: element of a[n] as the n-th element, we create a list { s[1], s[2], ...,
28: s[n] }, where each element s[i] is the number of element in a[i]. We can
29: pick any of the s[i] elements from a[i] for the specified element in the
30: set to be created, so the number of sets to be created is
31:
32: n
33: -----
34: | | s[p] .
35: | |
36: p=1
37:
38: That is, the product of the sizes of all the sets.
39:
40: Now that we know how many sets we'll be creating, we start to populate these
41: sets. We modify the same index of each set per loop; that is, we modify
42: a[0][0], a[1][0], a[2][0], ..., a[n][0], before we modify any index in a[1].
43:
44: I utilize a "repetition value", which starts at 1, and is multiplied by the
45: size of the previous set (s[i-1]) when the population of a specific index of
46: the new sets is complete. The repetition value indicates how many times the
47: specific element will be inserted in a row on a pass over an index. The
48: starting value of 1 means that each element in a[0] will be inserted once, and
49: then the next element will be entered, and after all elements have been
50: exhausted, we go back to inserting a[0].
51:
52: After we've exhausted a[0], we multiply the repetition value by s[0], and we
53: move on to a[1]. For each value here, we fill in the next index in the new
54: sets, but we do this R times in succession, where R is the repetition value.
55:
56: We continue through until the new sets are completed.
57:
58: =cut
59:
60: sub cartesian {
61: my $len = 1;
62: my (@ret,$rep,$i,$j,$p,$k);
63:
64: for (@_) { $len *= @$_ }
65:
66: for ($rep = 1, $i = 0; $i < @_; $rep *= @{ $_[$i] }, $i++) {
67: for ($j = 0, $p = 0; $j < $len; $j += $rep, $p++) {
68: for ($k = 0; $k < $rep; $k++) {
69: print STDERR << "DEBUGGING" if 0; # set to true to see debug output
70: repetition value: $rep
71: modifying set[@{[ $j + $k]}], index[$i]
72: value is element @{[ $p % @{ $_[$i] } ]} ('$_[$i][$p % @{ $_[$i] }]') of original set[$i]
73:
74: DEBUGGING
75: $ret[$j + $k][$i] = $_[$i][$p % @{ $_[$i] }]
76: }
77: }
78: }
79:
80: return @ret;
81: }
82:
83: # uncomment to see a test run
84: # print map "@$_\n", cartesian( [1,2] , [3,4,5] , [6,7] );
Re: Cartesian Cross-Products by Anonymous Monk on Feb 12, 2005 at 02:01 UTC |
Add "$#ret = $len -1" at line 64 for better performance.
For a 100*65*20 set, benchmark results are:
Benchmark: timing 10 iterations of new, old...
new: 16 wallclock secs (15.59 usr + 0.09 sys = 15.68 CPU) @ 0.64/s (n=10)
old: 21 wallclock secs (20.16 usr + 0.10 sys = 20.26 CPU) @ 0.49/s (n=10)
| [reply] |
|
People who learned C should "learn" Perl...
This is the "Perl" way to do it.
perl -e '
use strict;
my @a=qw{1 2 3};
my @b=qw{4 5};
my @c=();
foreach my $a (@a) {
foreach my $b (@b) {
push @c, [$a, $b];
}
}
print join("\n", map {join ",", @$_} @c), "\n"
'
---
1,4
1,5
2,4
2,5
3,4
3,5
Mike (mrdvt92) | [reply] [d/l] |
|
It is indeed the Perl way to do it if you know in advance how many arrays you'd like to get the product of.
If you don't, Japhy's algorithm is the way to go - even if it looks C-ish with a lot of indices. I suspect that it may be faster as well, using the array pre-allocation optimisation pointed out earlier.
| [reply] |
|
sub cartesian {
my @C = map { [ $_ ] } @{ shift @_ };
foreach (@_) {
my @A = @$_;
@C = map { my $n = $_; map { [ $n, @$_ ] } @C } @A;
}
return @C;
}
| [reply] [d/l] |
|
|
|
|
|
|