Re: possible combinations in sequence
by ikegami (Patriarch) on Jun 08, 2006 at 23:33 UTC
|
use Algorithm::Loops qw( NestedLoops );
my $source = 'horse:cow:dog:cat';
my @parts = split(/:/, $source);
my $iter = NestedLoops(
[
[ 0..$#parts ],
( sub { [ $_+1..$#parts ] } ) x $#parts,
],
{ OnlyWhen => 1 },
);
my @s;
print(join(':', map $parts[$_], @s), "\n")
while @s = $iter->();
Update: Even better:
my $source = 'horse:cow:dog:cat';
my @parts = split(/:/, $source);
for my $comb (1..2**@parts-1) {
my $s = join ':',
map $parts[$_],
grep $comb & (1<<$_),
0..$#parts;
print("$s\n");
}
Update: Neat, and even faster:
my $source = 'horse:cow:dog:cat';
local $_ = ":$source:";
my $parts = tr/:/:/ - 1;
my $re = '(?{ "" })'
. '(:[^:]*)(?=:)(?{ $^R . $^N })'
. '(?:.*(:[^:]*)(?=:)(?{ $^R . $^N })' x ($parts-1)
. ')?' x ($parts-1)
. '(?{ push @rv, substr($^R, 1) })'
. '(?!)';
{ use re 'eval'; $re = qr/$re/; }
local our @rv;
/$re/;
print "$_\n" foreach @rv;
| [reply] [d/l] [select] |
Re: possible combinations in sequence
by rhesa (Vicar) on Jun 08, 2006 at 23:41 UTC
|
My basic idea is to map the array indices to bits in a binary number. If a bit is on, you take that element out of the source array. For example:
0 = 0b0000 --> {nothing}
3 = 0b0011 --> 'horse:cow'
13 = 0b1101 --> 'horse:dog:cat'
The algorithm then simply becomes a loop over 1 .. 2**@kw -1, testing the bits for each number.
Here's my first implementation of it. It's probably not as efficient as possible yet.
sub rhesa {
# initial source in sequence order
my $source = 'horse:cow:dog:cat';
my @kw = split /:/, $source;
my @res;
for my $i( 1 .. 2**@kw - 1 ) {
my @ar; my $t;
while( $i > 0 ) {
push @ar, $kw[$t] if $i & 1;
$i >>= 1; $t++;
}
push @res, join ':', @ar;
}
return @res;
}
I'm a bit irritated with the number of temporary variables, but I can't think of anything prettier just now. Hope it helps :)
BTW, a simple Benchmark comparison showed a 200% speed increase over your version. | [reply] [d/l] [select] |
|
Rate ikegami1 ruzam ikegami2 rhesa ikegami3
ikegami1 2381/s -- -19% -66% -68% -72%
ruzam 2944/s 24% -- -58% -61% -65%
ikegami2 7072/s 197% 140% -- -5% -16%
rhesa 7478/s 214% 154% 6% -- -11%
ikegami3 8420/s 254% 186% 19% 13% --
| [reply] [d/l] [select] |
|
You guys are so beyond awesome! ikegami3 is nothing short of brilliance :) ++ to ikegami. rhesa, and liverpole.
Thanks to ikegami's benchmark, I ran my own benchmarks. I excluded ikegami1 simply because of the 'Algorithm::Loops' dependency. Then just for personal interest, I copied ikegami3 and replaced the '$parts - 1' parts:
sub ikegami3x {
local $_ = ":$_[0]:";
my $parts = tr/:/:/ - 2; # take 2 here instead of -1 later
my $re = '(?{ "" })'
. '(:[^:]*)(?=:)(?{ $^R . $^N })'
. '(?:.*(:[^:]*)(?=:)(?{ $^R . $^N })' x $parts
. ')?' x $parts
. '(?{ push @rv, substr($^R, 1) })'
. '(?!)';
{ use re 'eval'; $re = qr/$re/; }
local our @rv;
/$re/;
return @rv;
}
I also included rhesa2 with a slight change to eliminate 'uninitialized' warnings
sub rhesa2 {
my @kw = split /:/, $_[0];
map {
my @ar;
my $t = 0; # initialize $t
do {
($_ & 1) and push @ar, $kw[$t];
$t++;
} while ($_ >>= 1);
join ':', @ar;
} ( 1 .. 2**@kw - 1 );
}
I evened up all test functions to use $_[0], and finally I ran tests against different 'word counts' of the source (in actual use, $source will contain varying numbers of words).
These are my benchmark results (I've run this several times to come up with more or less the same results)
source: horse:cow:dog:cat
Rate ruzam ikegami3 ikegami2 ikegami3x rhesa2 rhesa
ruzam 4620/s -- -61% -61% -61% -66% -66%
ikegami3 11764/s 155% -- -0% -1% -12% -14%
ikegami2 11819/s 156% 0% -- -1% -12% -13%
ikegami3x 11935/s 158% 1% 1% -- -11% -13%
rhesa2 13444/s 191% 14% 14% 13% -- -2%
rhesa 13657/s 196% 16% 16% 14% 2% --
source: horse
Rate ikegami3 ikegami3x ruzam ikegami2 rhesa rhesa2
ikegami3 40841/s -- -1% -37% -62% -64% -72%
ikegami3x 41226/s 1% -- -37% -62% -64% -72%
ruzam 65317/s 60% 58% -- -39% -43% -55%
ikegami2 107178/s 162% 160% 64% -- -6% -26%
rhesa 114470/s 180% 178% 75% 7% -- -21%
rhesa2 145232/s 256% 252% 122% 36% 27% --
source: horse:cat
Rate ruzam ikegami3 ikegami3x ikegami2 rhesa rhesa2
ruzam 26853/s -- -2% -7% -48% -53% -58%
ikegami3 27324/s 2% -- -5% -47% -52% -58%
ikegami3x 28732/s 7% 5% -- -45% -50% -55%
ikegami2 51965/s 94% 90% 81% -- -9% -19%
rhesa 57233/s 113% 109% 99% 10% -- -11%
rhesa2 64472/s 140% 136% 124% 24% 13% --
source: horse:cow:cat
Rate ruzam ikegami3 ikegami3x ikegami2 rhesa rhesa2
ruzam 10772/s -- -41% -42% -58% -61% -64%
ikegami3 18305/s 70% -- -1% -28% -33% -38%
ikegami3x 18436/s 71% 1% -- -27% -33% -38%
ikegami2 25353/s 135% 39% 38% -- -7% -15%
rhesa 27363/s 154% 49% 48% 8% -- -8%
rhesa2 29753/s 176% 63% 61% 17% 9% --
source: horse:cow:dog:cat:mouse
Rate ruzam rhesa ikegami2 rhesa2 ikegami3x ikegami3
ruzam 1632/s -- -67% -68% -73% -75% -75%
rhesa 5021/s 208% -- -3% -17% -24% -24%
ikegami2 5159/s 216% 3% -- -14% -22% -22%
rhesa2 6023/s 269% 20% 17% -- -9% -9%
ikegami3x 6614/s 305% 32% 28% 10% -- -0%
ikegami3 6634/s 307% 32% 29% 10% 0% --
I can't nail down the box so the results can fluctuate quite a bit from test to test, but overall these seem to be consistent. rhesa2 takes the lead up to 4 words, ikegami3 takes over at 5 words (and even more so at 6 words). rhesa2 rocks in the low word counts, where as ikegami3 seems to have more overhead. In my real world use, the word count is usually 4 or less (4 was just a good example size), so rhesa2 wins and replaceses my original ruzam.
| [reply] [d/l] [select] |
|
|
| [reply] |
|
ikegami,
I was tired last night when I found this thread but I wanted to point out Finding all Combinations. I would be interested in seeing how the ones that produce the correct order compare (specifically mine).
| [reply] |
|
sub rhesa2 {
my $source = shift;
my @kw = split /:/, $source;
map {
my (@ar, $t);
do {
($_ & 1) and push @ar, $kw[$t];
$t++;
} while ($_ >>= 1);
join ':', @ar
} ( 1 .. 2**@kw - 1 )
}
my @res = rhesa2('horse:cow:dog:cat');
s''(q.S:$/9=(T1';s;(..)(..);$..=substr+crypt($1,$2),2,3;eg;print$..$/
| [reply] [d/l] |
|
| [reply] |
Re: possible combinations in sequence
by roboticus (Chancellor) on Jun 09, 2006 at 04:38 UTC
|
ruzam:
Here's my stab at it. I took out the hash because my method doesn't generate duplicates. Instead I just return the list of results:
#!/usr/bin/perl -w
use strict;
use warnings;
# initial source in sequence order
my $source = 'horse:cow:dog:cat';
function($source);
# and now the results
foreach (function($source)) {
print "$_\n";
}
# generate array of combinations
sub function {
my @t = split /:/, shift;
my @res=(shift @t);
for my $i (@t) {
@res=($i, @res, map{$_.':'.$i} @res);
}
return @res;
}
UPDATE: I didn't benchmark it because I've never used that module before. (I'll have to go install it and read up on it.) But I submitted it because I suspect that generating the strings from components might be faster than removing chunks. Any benchmarking ninjas out there wanna help me out?
--roboticus | [reply] [d/l] |
|
If you want a huge speed boost, replace
@rv=($i, @rv, map{$_.':'.$i} @rv);
with
push @rv, $i, map{$_.':'.$i} @rv;
By the way, I found confusing the use of $i for something which is not an index and not even numerical. $part would have worked fine, since it's an element of @parts. Well, it was named @parts before you renamed to worthless @t. Nice algorithm, but shoddy code.
| [reply] [d/l] [select] |
|
ikegami:
Thanks for the tip on the speed boost. Once I get the benchmarking stuff installed I'll play with it. While I do like your suggestion, I prefer the order that my method generates--all one-word combinations first, then the two-word combinations, etc.)
Re: shoddy code. Yeah, I guess so, consider me properly admonished. ++ for calling me on it and keeping me honest. When I thunk up the technique to use, I just erased the original function body and started whacking away at it. I didn't think to clarify things by using better variable names. (Of course, I just came off of a little golfing trip so my head was in "trim keystrokes" mode.</lame_excuse_mode>) Now, I guess the proper thing to do is to clean it up a little and insert your suggestion, so here goes:
<pedagogical_mode>
sub function {
my @parts = split /:/, shift;
# Null is the complete list of combinations for
# an empty word list
my @combinations=();
# Sequentially (recursively with tail recursion
# removed) rebuild the combination list adding one
# new word each iteration
for my $new_word (@parts) {
# Given a complete set of combinations for a
# given list of words, we can add a new word to
# the list and generate all new valid combinations
# by concatenating to the original list:
push @combinations,
# the new word (a single word is a valid
# combination)
$new_word,
# and the original list with the new word
# glommed onto the end of each member
map {$_.':'.$new_word} @combinations
;
}
return @combinations;
}
</pedagogical_mode>
--roboticus | [reply] [d/l] |
|
|
|
Whollopin Websnappers!
Taking a decisive lead, and proving yet again that simplicity is beauty: roboticus++
source: horse:cow:dog:cat
Rate ruzam ikegami3 rhesa rhesa2 roboticus
ruzam 4655/s -- -62% -66% -66% -80%
ikegami3 12238/s 163% -- -10% -11% -47%
rhesa 13612/s 192% 11% -- -1% -41%
rhesa2 13742/s 195% 12% 1% -- -40%
roboticus 22986/s 394% 88% 69% 67% --
source: horse
Rate ikegami3 ruzam rhesa rhesa2 roboticus
ikegami3 41518/s -- -36% -64% -72% -84%
ruzam 64752/s 56% -- -43% -56% -75%
rhesa 113875/s 174% 76% -- -22% -56%
rhesa2 146152/s 252% 126% 28% -- -43%
roboticus 257311/s 520% 297% 126% 76% --
source: horse:cat
Rate ruzam ikegami3 rhesa rhesa2 roboticus
ruzam 27050/s -- -3% -53% -59% -72%
ikegami3 27836/s 3% -- -52% -57% -71%
rhesa 57415/s 112% 106% -- -12% -40%
rhesa2 65371/s 142% 135% 14% -- -32%
roboticus 96089/s 255% 245% 67% 47% --
source: horse:cow:cat
Rate ruzam ikegami3 rhesa rhesa2 roboticus
ruzam 10920/s -- -43% -61% -64% -76%
ikegami3 19183/s 76% -- -32% -37% -59%
rhesa 28259/s 159% 47% -- -7% -39%
rhesa2 30244/s 177% 58% 7% -- -35%
roboticus 46412/s 325% 142% 64% 53% --
source: horse:cow:dog:cat:mouse
Rate ruzam rhesa2 rhesa ikegami3 roboticus
ruzam 1855/s -- -67% -68% -72% -82%
rhesa2 5676/s 206% -- -2% -14% -45%
rhesa 5781/s 212% 2% -- -13% -44%
ikegami3 6614/s 257% 17% 14% -- -36%
roboticus 10353/s 458% 82% 79% 57% --
source: horse:cow:dog:cat:rat:mouse
Rate ruzam rhesa2 rhesa ikegami3 roboticus
ruzam 799/s -- -70% -70% -77% -85%
rhesa2 2659/s 233% -- -0% -24% -50%
rhesa 2667/s 234% 0% -- -24% -50%
ikegami3 3521/s 340% 32% 32% -- -34%
roboticus 5371/s 572% 102% 101% 53% --
| [reply] [d/l] |
Re: possible combinations in sequence
by QM (Parson) on Jun 13, 2006 at 23:40 UTC
|
And for completeness, the glob solution, which spends more code fixing up the output than actually generating the results:
sub qm {
my ($glob) = @_;
my @rv;
$glob =~ s/(\w+)/{$1,}/g;
for my $combo (glob($glob))
{
$combo =~ s/^:+//;
next unless length($combo);
push @rv, join ':', split /:+/, $combo;
}
return @rv;
}
and is horribly slow as well.
BTW, it's interesting to note the change in benchmark results when the input is a long list of null strings:
$source = ':'x20;
-QM
--
Quantum Mechanics: The dreams stuff is made of
| [reply] [d/l] [select] |