in reply to possible combinations in sequence

My basic idea is to map the array indices to bits in a binary number. If a bit is on, you take that element out of the source array. For example:
``` 0 = 0b0000 --> {nothing}
3 = 0b0011 --> 'horse:cow'
13 = 0b1101 --> 'horse:dog:cat'
The algorithm then simply becomes a loop over 1 .. 2**@kw -1, testing the bits for each number.

Here's my first implementation of it. It's probably not as efficient as possible yet.

```sub rhesa {
# initial source in sequence order
my \$source = 'horse:cow:dog:cat';

my @kw = split /:/, \$source;
my @res;

for my \$i( 1 .. 2**@kw - 1 ) {
my @ar; my \$t;
while( \$i > 0 ) {
push @ar, \$kw[\$t] if \$i & 1;
\$i >>= 1; \$t++;
}
push @res, join ':', @ar;
}

return @res;
}
I'm a bit irritated with the number of temporary variables, but I can't think of anything prettier just now. Hope it helps :)

BTW, a simple Benchmark comparison showed a 200% speed increase over your version.

Replies are listed 'Best First'.
Re^2: possible combinations in sequence
by ikegami (Pope) on Jun 09, 2006 at 00:45 UTC

Yours is also faster (albeit only 7% faster) than my grep approach, but my regexp approach is 13% faster than yours.

```           Rate ikegami1    ruzam ikegami2    rhesa ikegami3
ikegami1 2381/s       --     -19%     -66%     -68%     -72%
ruzam    2944/s      24%       --     -58%     -61%     -65%
ikegami2 7072/s     197%     140%       --      -5%     -16%
rhesa    7478/s     214%     154%       6%       --     -11%
ikegami3 8420/s     254%     186%      19%      13%       --
You guys are so beyond awesome! ikegami3 is nothing short of brilliance :) ++ to ikegami. rhesa, and liverpole.
Thanks to ikegami's benchmark, I ran my own benchmarks. I excluded ikegami1 simply because of the 'Algorithm::Loops' dependency. Then just for personal interest, I copied ikegami3 and replaced the '\$parts - 1' parts:
```sub ikegami3x {
local \$_ = ":\$_[0]:";

my \$parts = tr/:/:/ - 2; # take 2 here instead of -1 later

my \$re = '(?{ "" })'
. '(:[^:]*)(?=:)(?{ \$^R . \$^N })'
. '(?:.*(:[^:]*)(?=:)(?{ \$^R . \$^N })' x \$parts
. ')?' x \$parts
. '(?{ push @rv, substr(\$^R, 1) })'
. '(?!)';

{ use re 'eval'; \$re = qr/\$re/; }

local our @rv;
/\$re/;
return @rv;
}

I also included rhesa2 with a slight change to eliminate 'uninitialized' warnings
```sub rhesa2 {
my @kw = split /:/, \$_[0];
map {
my @ar;
my \$t = 0; # initialize \$t
do {
(\$_ & 1) and push @ar, \$kw[\$t];
\$t++;
} while (\$_ >>= 1);
join ':', @ar;
} ( 1 .. 2**@kw - 1 );
}

I evened up all test functions to use \$_[0], and finally I ran tests against different 'word counts' of the source (in actual use, \$source will contain varying numbers of words).
These are my benchmark results (I've run this several times to come up with more or less the same results)
```source: horse:cow:dog:cat
Rate  ruzam  ikegami3  ikegami2  ikegami3x  rhesa2  rhesa
ruzam      4620/s     --     -61%      -61%       -61%    -66%   -66%
ikegami3  11764/s   155%       --       -0%        -1%    -12%   -14%
ikegami2  11819/s   156%       0%        --        -1%    -12%   -13%
ikegami3x 11935/s   158%       1%        1%         --    -11%   -13%
rhesa2    13444/s   191%      14%       14%        13%      --    -2%
rhesa     13657/s   196%      16%       16%        14%      2%     --

source: horse
Rate ikegami3  ikegami3x  ruzam  ikegami2  rhesa  rhesa2
ikegami3   40841/s      --        -1%    -37%     -62%   -64%    -72%
ikegami3x  41226/s      1%         --    -37%     -62%   -64%    -72%
ruzam      65317/s     60%        58%      --     -39%   -43%    -55%
ikegami2  107178/s    162%       160%     64%       --    -6%    -26%
rhesa     114470/s    180%       178%     75%       7%     --    -21%
rhesa2    145232/s    256%       252%    122%      36%    27%      --

source: horse:cat
Rate  ruzam  ikegami3  ikegami3x  ikegami2  rhesa  rhesa2
ruzam     26853/s     --      -2%        -7%      -48%   -53%    -58%
ikegami3  27324/s     2%       --        -5%      -47%   -52%    -58%
ikegami3x 28732/s     7%       5%         --      -45%   -50%    -55%
ikegami2  51965/s    94%      90%        81%        --    -9%    -19%
rhesa     57233/s   113%     109%        99%       10%     --    -11%
rhesa2    64472/s   140%     136%       124%       24%    13%      --

source: horse:cow:cat
Rate  ruzam  ikegami3 ikegami3x  ikegami2  rhesa  rhesa2
ruzam     10772/s     --     -41%      -42%      -58%   -61%    -64%
ikegami3  18305/s    70%       --       -1%      -28%   -33%    -38%
ikegami3x 18436/s    71%       1%        --      -27%   -33%    -38%
ikegami2  25353/s   135%      39%       38%        --    -7%    -15%
rhesa     27363/s   154%      49%       48%        8%     --     -8%
rhesa2    29753/s   176%      63%       61%       17%     9%      --

source: horse:cow:dog:cat:mouse
Rate  ruzam  rhesa  ikegami2  rhesa2  ikegami3x  ikegami3
ruzam     1632/s     --  -67%      -68%    -73%       -75%      -75%
rhesa     5021/s   208%    --       -3%    -17%       -24%      -24%
ikegami2  5159/s   216%    3%        --    -14%       -22%      -22%
rhesa2    6023/s   269%   20%       17%      --        -9%       -9%
ikegami3x 6614/s   305%   32%       28%     10%         --       -0%
ikegami3  6634/s   307%   32%       29%     10%         0%        --

I can't nail down the box so the results can fluctuate quite a bit from test to test, but overall these seem to be consistent. rhesa2 takes the lead up to 4 words, ikegami3 takes over at 5 words (and even more so at 6 words). rhesa2 rocks in the low word counts, where as ikegami3 seems to have more overhead. In my real world use, the word count is usually 4 or less (4 was just a good example size), so rhesa2 wins and replaceses my original ruzam.

The name of the variable doesn't match the value it contains in
my \$parts = tr/:/:/ - 2; # take 2 here instead of -1 later
and it doesn't give you anything. That was a step in the wrong direction.

Your use of grep is adorable! I think it's by far the most readable version, with the clearest exposition of intent.

The regexp on the other hand... the approach certainly is neat, i'll give you that ;) ++ for speed and ingenuity, but ouch does my head spin!

ikegami,
I was tired last night when I found this thread but I wanted to point out Finding all Combinations. I would be interested in seeing how the ones that produce the correct order compare (specifically mine).

Cheers - L~R

Re^2: possible combinations in sequence
by liverpole (Monsignor) on Jun 09, 2006 at 00:45 UTC

Here's my somewhat golfed variation:

```sub rhesa2 {
my \$source = shift;
my @kw = split /:/, \$source;
map {
my (@ar, \$t);
do {
(\$_ & 1) and push @ar, \$kw[\$t];
\$t++;
} while (\$_ >>= 1);
join ':', @ar
} ( 1 .. 2**@kw - 1 )
}

my @res = rhesa2('horse:cow:dog:cat');

s''(q.S:\$/9=(T1';s;(..)(..);\$..=substr+crypt(\$1,\$2),2,3;eg;print\$..\$/
Re^2: possible combinations in sequence
by grinder (Bishop) on Jun 09, 2006 at 05:46 UTC
My basic idea is to map the array indices to bits in a binary number. If a bit is on, you take that element out of the source array.

This is precisely the same approach that I used in Data::PowerSet, for indeed, what the OP is looking for is the power set of the list.

• another intruder with the mooring in the heart of the Perl