Beefy Boxes and Bandwidth Generously Provided by pair Networks DiBona
Don't ask to ask, just ask
 
PerlMonks  

Finding all Combinations

by narse (Pilgrim)
on Nov 29, 2001 at 09:23 UTC ( #128286=perlquestion: print w/ replies, xml ) Need Help??
narse has asked for the wisdom of the Perl Monks concerning the following question:

I am workong on a school assignment that involves testing all the different combinations of array elements in @speeds. Currently I am using this code to do that:
use strict; use Math::monthly::CarClumps qw\count\; my ( @order , @clumps, $total , @speeds ); # speed of cars @speeds = ( 50 .. 59 ); # initialize @clumps as {0} @clumps = (0) x ( @speeds + 1 ); clumps ( @speeds ); foreach my $i ( 1 .. $#clumps ) { print $i + $speeds[0] - 1, " :\t$clumps[$i]\t( ", ( $clumps[$i] * 100 / $total ), "% )\n" } print "total :\t$total ( = ", scalar @speeds, "! )\n"; exit 0; sub clumps { my @cars = @_; if ( scalar @cars eq 0 ) { $clumps[count(@order)]++; $total++ } else { foreach ( 0 .. $#cars) { my $speed = pop @cars; unshift @order, $speed; clumps( @cars ); unshift @cars, $speed; } } shift @order; return 0; }
The clumps() sub uses a tree recursion model which is really slow (particularly on a 90mhz box). Has anyone had to do a similar thing and can suggest a better model for testing all the combinations of @speeds?

Comment on Finding all Combinations
Download Code
(tye)Re: Finding all Combinations
by tye (Cardinal) on Nov 29, 2001 at 10:02 UTC

    Sure. No need to do any recursion here. Just count in binary. Here is a version that creates a closure (an anonymous subroutine that holds the needed data inside of itself -- sort of like a tiny "object") that returns the next combination each time it is called:

    sub combinations { my @list= @_; my @pick= (0) x @list; return sub { my $i= 0; while( 1 < ++$pick[$i] ) { $pick[$i]= 0; return if $#pick < ++$i; } return @list[ grep $pick[$_], 0..$#pick ]; }; } my $next= combinations( 50..59 ); my @comb; while( @comb= $next->() ) { # do work with @comb here } # Note that the empty set is a valid combination but is # the last combination returned which also indicates "no # more combinations left. So the above loop doesn't bother # processing the empty list. If you want to process the # empty set, then use: my @comb; do { # do work with @comb here } while( @comb= $next->() );

    Update: My code finds combinations but the original code finds permutations even though the author asked for combinations. (See (tye)Re: Permutations if you don't know the difference between the two.)

    Of course, my favorite way of finding permutations is Permuting with duplicates and no memory.

            - tye (but my friends call me "Tye")
      I don't understand that algorithm at all. It looks like it's just setting stuff to true so that you use it?

      ------
      We are the carpenters and bricklayers of the Information Age.

      Don't go borrowing trouble. For programmers, this means Worry only about what you need to implement.

      That's fantastic - I was wondering if you could perhaps help out an initiate such as myself and comment that code - what it's doing and why, as I'm having a bit of a hard time following it and would like to learn the methodology behind it (as i believe i was trying to domsething similar but failed completely). Thanks!
        sub combinations { my @list= @_; # List of items to choose from my @pick= (0) x @list; # Whether we want each item # $pick[$i] means include $list[$i] in results. # So @pick currently describes the empty subset. # Return a closure that, each time it is called, returns # the next subset: return sub { # Treat @pick as a base-2 number and increment it. # Note that @pick started as all 0s and we stop # after it is all 1s so all cases get covered. # (See original node for handling the empty subset) # Start at least-significant bit, $pick[0]: my $i= 0; # Increment a bit. If the bit was already 1, then # set it to 0 and continue to next bit: while( 1 < ++$pick[$i] ) { $pick[$i]= 0; # If we've run out of bits, then we were at # all 1s and so are done. Return empty list: return if $#pick < ++$i; } # The grep() below returns the indices for which # $pick[$_] is not 0. The @list[...] is an array # slice that returns the list of elements of @list # at the indices returned by grep. That is, we # return all items $list[$i] where $pick[$i] is # not 0. Same as: # map { $pick[$_] ? $list[$_] : () } 0..$#list; return @list[ grep $pick[$_], 0..$#pick ]; }; } my $next= combinations( 50..59 ); my @comb; while( @comb= $next->() ) { # do work with @comb here }

        Does that help?

                - tye (but my friends call me "Tye")
      tye,
      I expanded the code I came up with for combinations of a fixed size:
      #!/usr/bin/perl use strict; use warnings; my $iter = combo( 30..50 ); while ( my @combo = $iter->() ) { print "@combo\n"; } sub combo { my @list = @_; return sub { () } if ! @_; my (@position, @stop, $end_pos, $done); my ($by, $next) = (0, 1); return sub { return () if $done; if ( $next ) { $by++; return () if $by > @list; @position = (0 .. $by - 2, $by - 2); @stop = @list - $by .. $#list; $end_pos = $#position; $next = undef; } my $cur = $end_pos; { if ( ++$position[ $cur ] > $stop[ $cur ] ) { $position[ --$cur ]++; redo if $position[ $cur ] > $stop[ $cur ]; my $new_pos = $position[ $cur ]; @position[ $cur .. $end_pos ] = $new_pos .. $new_pos + + $by; } } if ( $position[0] == $stop[0] ) { $position[0] == @list ? $done = 1 : $next = 1; } return @list[ @position ]; } }
      I haven't analyzed its O factor or spent a lot of time with good benchmarks, but it does appear to be considerably faster for combinations of 30..50.

      Cheers - L~R

        I don't think this is really called "combinations" because the number of elements to select isn't specified beforehand. It is really finding all possible subsets, which is the same as finding the power set.

        Another approach to it uses Algorithm::Loops and is very simple; (well, if you understand nesting loops). You loop over 0..$#set finding the first element of the subset, then loop over the next element of the subset ($_+1..$#set), etc:

        use Algorithm::Loops qw( NestedLoops ); sub powerSetGen2 { my $end= shift(@_) - 1; return NestedLoops( [ [ 0..$end ], ( sub { [ $_+1 .. $end ] } ) x $end, ], { OnlyWhen => 1, }, ); } my $size= @ARGV ? shift(@ARGV) : 40; my @set= 1..$size; $|= 1; my $start= time(); my $iter= powerSetGen2( $size ); my @subSet= (); my $count= 0; do { $count++; print "( @subSet )$/" if @ARGV; } while( @subSet= @set[ $iter->() ] ); print "$count subsets for $size in ", time()-$start, " secs.$/";

        Then you can implement this same approach directly (without using the module):

        sub powerSetGen3 { my $end= shift(@_) - 1; my @idx; return sub { if( ! @idx ) { push @idx, 0; } elsif( $idx[-1] < $end ) { push @idx, 1+$idx[-1]; } else { pop @idx; $idx[-1]++ if @idx; } return @idx; }; }

        And this code is so very simple, that I'm at a loss to explain why Limbic~Region's code is a little faster for large sets. His code goes about finding the subsets in a quite different order (and skips one subset) but the routines get called the same number of times and it appears to me that Limbic~Region's would do more work in an average call; but my benchmarks say that I'm wrong.

        - tye        

        Limbic~Region:

        I've finally got it! Thanks for the help you posted on your scratchpad. After a few hours of study, it finally paid off. I've commented it to describe how it works, and made a few changes to fix a minor bug, and remove some code that is never executed, and removed a state variable:

        #------------------------------------------------------------ # Return an iterator of all possible combinations (of all # lengths) of a set of symbols with the constraint that each # symbol in each result is less than the symbol to its right. # sub combo { # The symbols we draw our results from: my @list = @_; # The trivial case return sub { ( ) } if ! @_; # Persistent state for the closure my (@position, # Last set of symbol indices generated @stop); # Last set possible for $by symbols # Start by telling iterator that it just finished # (next=1) all results of 0 digits. my ($by, $next) = (0, 1); return sub {
        # We're done after we've returned a list of all symbols return () if @position == @list;
        if ( $next ) { # We finished all combos of size $by, now do $by+1 $by++;
        # If new size is larger than list, we're done! return () if $by > @list;
        # Start with leftmost $by symbols (except last, # which is preincremented before use) @position = (0 .. $by - 2, $by - 2); # Our stop condition is when we've returned the # rightmost $by symbols @stop = @list - $by .. $#list; $next = undef; } # Start by trying to advance the rightmost digit my $cur = $#position; { # **** redo comes back here! **** # Advance current digit to next symbol if ( ++$position[ $cur ] > $stop[ $cur ] ) { # Keep trying next-most rightmost digit # until we find one that's not 'stopped' $position[ --$cur ]++; redo if $position[ $cur ] > $stop[ $cur ]; # Reset digits to right of current digit to # the leftmost possible positions my $new_pos = $position[ $cur ]; @position[$cur .. $#position] = $new_pos .. $new_pos+$ +by; } } # Advance to next result size when we return last # possible result of this size $next = $position[0]==$stop[0]; return @list[ @position ]; } }
        Thanks again! I learned a lot from this exercise.

        UPDATE: I just tweaked the code a bit to make it check for done less frequently so it'll run a bit quicker. It munges up the code listing a bit though. Is there a better way to edit the code so it's obvious without interspersing download links?

        --roboticus

Re: Finding all Combinations
by Masem (Monsignor) on Nov 29, 2001 at 16:59 UTC
    Sounds familiar... though the code offered there might not be great for a school assignment, but there are other solutions listed as well.

    -----------------------------------------------------
    Dr. Michael K. Neylon - mneylon-pm@masemware.com || "You've left the lens cap of your mind on again, Pinky" - The Brain
    "I can see my house from here!"
    It's not what you know, but knowing how to find it if you don't know that's important

      You link to a discussion of permutations but the original poster requested combinations and it appears to me that the code here does find combinations (though I didn't fully digest the code so I could be wrong). Update: But actually running the code (modified so that I don't need the mentioned module) shows that it is finding permutations. *sigh* (:

      Confusion between these two is not uncommon. See (tye)Re: Permutations for another instance and a bit of explanation of how to tell the two apart.

              - tye (but my friends call me "Tye")
        Thanks for the input, and yes my misguidedness did mean to say permutations. I tried your permutations sub and that did speed up the program a small ammount. It turns out that Algorithm::Permute is much faster than them both so I'm using that now instead. Thanks.
Reaped: Re: Finding all Combinations
by NodeReaper (Curate) on Dec 24, 2010 at 10:08 UTC

Log In?
Username:
Password:

What's my password?
Create A New User
Node Status?
node history
Node Type: perlquestion [id://128286]
Approved by root
help
Chatterbox?
and the web crawler heard nothing...

How do I use this? | Other CB clients
Other Users?
Others avoiding work at the Monastery: (8)
As of 2014-04-24 11:53 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    April first is:







    Results (565 votes), past polls