good chemistry is complicated,and a little bit messy -LW PerlMonks

### Re^2: Finding all Combinations

by Limbic~Region (Chancellor)
 on Sep 27, 2004 at 14:06 UTC ( #394168=note: print w/replies, xml ) Need Help??

in reply to (tye)Re: Finding all Combinations

tye,
I expanded the code I came up with for combinations of a fixed size:
```#!/usr/bin/perl
use strict;
use warnings;

my \$iter = combo( 30..50 );
while ( my @combo = \$iter->() ) {
print "@combo\n";
}

sub combo {
my @list = @_;
return sub { () } if ! @_;

my (@position, @stop, \$end_pos, \$done);
my (\$by, \$next) = (0, 1);

return sub {
return () if \$done;
if ( \$next ) {
\$by++;
return () if \$by > @list;
@position = (0 .. \$by - 2, \$by - 2);
@stop     = @list - \$by .. \$#list;
\$end_pos  = \$#position;
\$next = undef;
}
my \$cur = \$end_pos;
{
if ( ++\$position[ \$cur ] > \$stop[ \$cur ] ) {
\$position[ --\$cur ]++;
redo if \$position[ \$cur ] > \$stop[ \$cur ];
my \$new_pos = \$position[ \$cur ];
@position[ \$cur .. \$end_pos ] = \$new_pos .. \$new_pos +
+ \$by;
}
}
if ( \$position[0] == \$stop[0] ) {
\$position[0] == @list ? \$done = 1 : \$next = 1;
}
return @list[ @position ];
}
}
I haven't analyzed its O factor or spent a lot of time with good benchmarks, but it does appear to be considerably faster for combinations of 30..50.

Cheers - L~R

Replies are listed 'Best First'.
Re^3: Finding all Combinations (subsets)
by tye (Sage) on Oct 01, 2004 at 20:49 UTC

I don't think this is really called "combinations" because the number of elements to select isn't specified beforehand. It is really finding all possible subsets, which is the same as finding the power set.

Another approach to it uses Algorithm::Loops and is very simple; (well, if you understand nesting loops). You loop over 0..\$#set finding the first element of the subset, then loop over the next element of the subset (\$_+1..\$#set), etc:

```use Algorithm::Loops qw( NestedLoops );

sub powerSetGen2 {
my \$end= shift(@_) - 1;
return NestedLoops(
[
[ 0..\$end ],
( sub {
[ \$_+1 .. \$end ]
} ) x \$end,
],
{
OnlyWhen => 1,
},
);
}

my \$size= @ARGV ? shift(@ARGV) : 40;
my @set= 1..\$size;
\$|= 1;

my \$start= time();
my \$iter= powerSetGen2( \$size );
my @subSet= ();
my \$count= 0;
do {
\$count++;
print "( @subSet )\$/"
if  @ARGV;
} while(  @subSet= @set[ \$iter->() ]  );
print "\$count subsets for \$size in ", time()-\$start, " secs.\$/";

Then you can implement this same approach directly (without using the module):

```sub powerSetGen3 {
my \$end= shift(@_) - 1;
my @idx;
return sub {
if(  ! @idx  ) {
push @idx, 0;
} elsif(  \$idx[-1] < \$end  ) {
push @idx, 1+\$idx[-1];
} else {
pop @idx;
\$idx[-1]++   if  @idx;
}
return @idx;
};
}

And this code is so very simple, that I'm at a loss to explain why Limbic~Region's code is a little faster for large sets. His code goes about finding the subsets in a quite different order (and skips one subset) but the routines get called the same number of times and it appears to me that Limbic~Region's would do more work in an average call; but my benchmarks say that I'm wrong.

- tye

Re^3: Finding all Combinations
by roboticus (Chancellor) on Jun 23, 2006 at 14:21 UTC
Limbic~Region:

I've finally got it! Thanks for the help you posted on your scratchpad. After a few hours of study, it finally paid off. I've commented it to describe how it works, and made a few changes to fix a minor bug, and remove some code that is never executed, and removed a state variable:

```#------------------------------------------------------------
# Return an iterator of all possible combinations (of all
# lengths) of a set of symbols with the constraint that each
# symbol in each result is less than the symbol to its right.
#
sub combo {

# The symbols we draw our results from:
my @list = @_;

# The trivial case
return sub { ( ) } if ! @_;

# Persistent state for the closure
my (@position, # Last set of symbol indices generated
@stop);    # Last set possible for \$by symbols

# Start by telling iterator that it just finished
# (next=1) all results of 0 digits.
my (\$by, \$next) = (0, 1);

return sub {
```        # We're done after we've returned a list of all symbols
return () if @position == @list;
```        if ( \$next ) {
# We finished all combos of size \$by, now do \$by+1
\$by++;
```            # If new size is larger than list, we're done!
return () if \$by > @list;
```
# which is preincremented before use)
@position = (0 .. \$by - 2, \$by - 2);

# Our stop condition is when we've returned the
# rightmost \$by symbols
@stop = @list - \$by .. \$#list;

\$next = undef;
}

# Start by trying to advance the rightmost digit
my \$cur = \$#position;
{   # **** redo comes back here! ****

# Advance current digit to next symbol
if ( ++\$position[ \$cur ] > \$stop[ \$cur ] ) {

# Keep trying next-most rightmost digit
# until we find one that's not 'stopped'
\$position[ --\$cur ]++;
redo if \$position[ \$cur ] > \$stop[ \$cur ];

# Reset digits to right of current digit to
# the leftmost possible positions
my \$new_pos = \$position[ \$cur ];
@position[\$cur .. \$#position] = \$new_pos .. \$new_pos+\$
+by;
}
}

# Advance to next result size when we return last
# possible result of this size
\$next = \$position[0]==\$stop[0];

return @list[ @position ];
}
}
Thanks again! I learned a lot from this exercise.

UPDATE: I just tweaked the code a bit to make it check for done less frequently so it'll run a bit quicker. It munges up the code listing a bit though. Is there a better way to edit the code so it's obvious without interspersing download links?

--roboticus

roboticus,
I am glad that you were able to decipher given my description of how the algorithm works. I am interested in knowing what was the minor bug and what code was never executed? If you found this interesting, might I suggest you move on to How A Function Becomes Higher Order and my iterator tutorial.

Cheers - L~R

Limbic~Region:

The minor bug is that this statement:

```    \$position[0] == @list ? \$done = 1 : \$next = 1;
never sets the \$done flag. I think you meant:

```    @position == @list ? \$done = 1 : \$next = 1;
And because \$done is never set,

```        return () if \$done;
never executes. It's a truly minor bug though, because you have another bit of code that detects the end condition:

```            return () if \$by > @list;
So I just removed \$done entirely. (Actually, looking at my code, when I was rearranging, I removed that last condition. I should've kept it because my code tests for the end condition on every call, where your terminator checks only when it's time to go to the next size of results. I'll update my node to reflect that.)

The other thing I did was remove \$end_pos since its value has such a limited lifetime, and add a passel of comments so I wouldn't forget how it works!

Now off to How A Function Becomes Higher Order and yer iterator toot!

--roboticus

Create A New User
Domain Nodelet?
Node Status?
node history
Node Type: note [id://394168]
help
Chatterbox?
and the web crawler heard nothing...

How do I use this?Last hourOther CB clients
Other Users?
Others rifling through the Monastery: (3)
As of 2023-12-04 20:09 GMT
Sections?
Information?
Find Nodes?
Leftovers?
Voting Booth?
What's your preferred 'use VERSION' for new CPAN modules in 2023?

Results (25 votes). Check out past polls.

Notices?