ezekiel has asked for the wisdom of the Perl Monks concerning the following question:
I have an array of arbitrary length. I want to get all possible sub-arrays of a certain length. For example,
@my_array = (0, 1, 2, 3);
$my_sub_array_length = 2;
# possible combinations are
(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3)
A search of the site lead me to this module which looks like it might do what I want but the links therein to pod and other documentation seem to be no longer active.
Does anyone know of new links to the documentation for the module? or does anyone have any other suggestions for doing these combinatorics?
Thanks.
(tye)Re: Combinatorics
by tye (Sage) on Aug 22, 2002 at 03:10 UTC
|
sub genFixedSubsets
{
my( $size, @set )= @_;
my @idx= reverse 0..$size-1;
return sub {
return if $size < @idx;
my @ret= @set[@idx];
my $i= 0;
$i++ until ++$idx[$i] < @set-$i || $size < $i;
$idx[$i]= 1+$idx[1+$i] while 0 <= --$i;
return @ret;
};
}
my $gen= genFixedSubsets( $ARGV[0] || 3, 1..($ARGV[1]||5) );
my @subset;
while( @subset= $gen->() ) {
print "@subset\n";
}
For example:
$ subsets 3 5
3 2 1
4 2 1
5 2 1
4 3 1
5 3 1
5 4 1
4 3 2
5 3 2
5 4 2
5 4 3
- tye (but my friends call me "Tye") | [reply] [d/l] [select] |
Re: Combinatorics
by Aristotle (Chancellor) on Aug 22, 2002 at 03:23 UTC
|
#!/usr/bin/perl -w
use strict;
use Data::Dumper;
sub unshift_many {
my $scalar = shift;
unshift @$_, $scalar for @_;
@_;
}
sub combinations {
my ( $array, $len, $start ) = @_;
$start ||= 0;
return unless $len > 0;
return $len == 1
? map [ $_ ], @{ $array }[ $start .. $#$array ]
: map unshift_many( $array->[$_], combinations( $array, $len-1
+, $_+1 ) ),
$start .. $#$array;
}
$Data::Dumper::Indent=0;
print Dumper( [ combinations [ qw( 1 2 3 4 5 ) ], 3 ] ), "\n";
Makeshifts last the longest. | [reply] [d/l] |
Re: Combinatorics
by dpuu (Chaplain) on Aug 22, 2002 at 02:43 UTC
|
I don't know about the module, but its a fairly simple (recursive) subroutine:
use Data::Dumper;
my @a = (1..4);
my $len = 4;
print Dumper(combinations($len, @a));
sub combinations
{
my ($size, @elements) = @_;
return [] if $size < 1;
my @result = ();
foreach my $elem (@elements)
{
push @result,
map { [$elem, @$_] }
combinations($size-1,
grep { $_ != $elem }
@elements)
}
return @result;
}
--Dave
Update: for greater generality, replace foreach loop with
my @seen = ();
while (@elements)
{
my $elem = shift @elements;
push @result,
map { [$elem, @$_] }
combinations($size-1, @seen, @elements);
push @seen, $elem;
}
| [reply] [d/l] [select] |
|
This looks great! except that it gives permutations rather than combinations. For example, it produces both (1, 2, 3) and (1, 3, 2) whereas, for my purposes, these are the same thing i.e., order is not important. It gives me a starting point though - thanks!
| [reply] |
Re: Combinatorics
by jryan (Vicar) on Aug 22, 2002 at 03:05 UTC
|
Set "$set_group_size" to the size you want the chunks to be.
use Data::Dumper;
my $set_group_size = 2;
my @list = 1..5;
print Dumper ([ sumList($set_group_size,@list) ]);
sub sumList {
my @sumlist;
my $size = shift;
while ( @_ ) {
my @current = splice @_, 0, $size-1;
foreach my $item (@_) {
push ( @sumlist, [@current,$item] );
}
unshift (@_, @current[1..$#current]);
}
return @sumlist;
}
| [reply] [d/l] |
Re: Combinatorics
by DamnDirtyApe (Curate) on Aug 22, 2002 at 05:30 UTC
|
#! /usr/bin/perl -w
use strict ;
use Data::Dumper ;
$|++ ;
my @my_array = (0, 1, 2, 3) ;
my $my_sub_array_length = 2 ;
my @subsets = () ;
for ( 3 .. 2 ** @my_array ) {
my @digits = reverse split // => sprintf "%b", $_ ;
if ( ( grep { $_ } @digits ) == $my_sub_array_length ) {
my @sub_arr = grep { $digits[$_] } @my_array ;
push @subsets, \@sub_arr ;
}
}
print Dumper( \@subsets ) ;
exit ;
__END__
_______________
DamnDirtyApe
Those who know that they are profound strive for clarity. Those who
would like to seem profound to the crowd strive for obscurity.
--Friedrich Nietzsche
| [reply] [d/l] |
Re: Combinatorics
by bart (Canon) on Aug 22, 2002 at 09:12 UTC
|
Ah, a twist. The common question is about permutations.
Nevertheless, I'd still use recursion. Say you need $n elements. Is the first element included? If yes, combine it with all combinations of $n-1 elements from the list of all of the following elements. If no, get all combinations of $n elements of the same sublist. Of course, you need to include every possible solution, which means walking every possible path.
Code!
sub combinations {
my $n = shift;
return [@_] if $n == @_;
return () if $n > @_ or $n < 0;
my $first = shift;
my @r = ((map [ $first, @$_ ], combinations($n-1, @_)),
combinations($n, @_));
return @r;
}
use Data::Dumper;
print Dumper [ combinations(2, (0, 1, 2, 3)) ];
It appears to be working well.
Update:
I'm pretty sure inserting
return [] if $n == 0;
at the appropriate place, i.e. among the other return statements near the top, will improve efficiency quite a bit. It avoids doing a lot of useless recursion if all you want is the empty list as a singleton.
| [reply] [d/l] [select] |
|
| [reply] [d/l] |
Re: Combinatorics
by BrowserUk (Patriarch) on Aug 22, 2002 at 19:09 UTC
|
Update:Having heard back from the author of the original C-source of this algorithm, he asked me to change the email address to his new one.
Not as concise as most of the others, but it seems fairly efficient for both memory and speed.
It probably could be golfed some more, but it defeated my attempts so far.
If your application calls for sub-setting different sets, but with the same size of both subset and set, you can generate the subsets of indices to the sets, rather that the subsets themselves and reuse the indices.
#! perl -w
use strict;
=pod
Bit-twiddling transpositional combination generator in Perl,
© 2002,BrowserUK / perlmonks.com
Based upon a C implementation by Doug Moore (unkadoug@yahoo.com).
Source:http://www.caam.rice.edu/~dougm/twiddle/yargbitcomb.c
=cut
sub Lshift1
{
use integer;
my $i = shift;
my $ii = $i>>1;
return 1 << $ii << ($i - $ii);
}
sub yargFirstComb
# Returns the inverse gray code (yarg) of the first combination of k i
+tems (i.e. {0,1,..,k-1})
{
use integer;
my $kk = Lshift1($_[0])-1;
return $kk ^ $kk/3;
};
sub leastItem
# Returns the least item in a combination (i. e. leastItem({2,4,5}) ==
+ {2}
{ use integer; return $_[0] & -$_[0]; };
sub yargLastComb
# Returns the yarg of the last combination of k items from n (i.e. {n-
+k,..,n-1})
{
use integer;
my ($nn, $kk) = ( Lshift1($_[0])-1, Lshift1($_[1])-1);
return ($_[1]) ? $nn ^ ($kk/3) : 0;
};
# Returns the yarg of the next combination after yarg input
sub yargNextComb {
use integer;
my $comb = shift;
my $grey = ($comb << 1) ^ $comb;
my $i = 2;
my $candidateBits;
do {
my $y = ($comb & ~($i - 1)) + $i;
my $j = leastItem( $y ) << 1;
my $h = !!($y & $j);
$candidateBits = (($j - $h) ^ $grey) & ( $j - $i );
$i = $j;
} while (!$candidateBits);
return $comb + leastItem($candidateBits);
}
sub factorial { no integer; my ($f,$n) = (1,shift); $f *= $n-- while(
+$n ); return $f; }
sub subsets {
use integer;
my @AoAoCombs;
my ($k, $n, $combs) = (shift, shift, 0);
{
no integer;
$combs = factorial($n)/(factorial($k)*factorial($n-$k));
print "Generating $combs subsets of $k from a set of $n\n";
$#AoAoCombs = $combs-1; #pre-extend t
+he array of array refs to its final size
}
die "Usage: subsets k, n\nGenerate subsets of k-elements from a se
+t of n-elements where k < n.\n"
unless $n and $k and $k < $n;
my $comb = yargFirstComb($k);
my $lastcomb = yargLastComb( $n, $k);
while(1) {
my $member = 0; #!!
my $c = $comb ^ ($comb >> 1);
# 'push' anon array ref & pre-extend anon. array space
$AoAoCombs[--$combs] = [];
$#{$AoAoCombs[$combs]} = $k-1;
($c & Lshift1($_)) and @{$AoAoCombs[$combs]}[$member++] = $_ f
+or 0 .. $n-1; # 'unshift'
last if $comb == $lastcomb;
$comb = yargNextComb($comb);
}
return \@AoAoCombs;
}
my $AoAoCombs = subsets 2, 4; # Generate combinations of indices
my @data1 = qw( just another perl hacker );
local $,=' ';
print "Applying combined indices to @data1\n\n";
print @data1[ @{$AoAoCombs->[$_]} ], $/ for (0 .. $#$AoAoCombs);
my @data2= (1,2,3,4);
print "\nApplying combined indices to @data2\n\n";
print @data2[ @{$AoAoCombs->[$_]} ], $/ for (0 .. $#$AoAoCombs); # App
+ly the indices to as many sets as you like
print $/;
no integer;
my @data3 = (1..31);
my @times = (times);
my $start = $times[0] + $times[1];
$AoAoCombs = subsets 26, ~~@data3;
@times = times;
my $end = $times[0]+$times[1];
print "Generating " . @{$AoAoCombs} . " combinations of 26 from 31 too
+k ", $end-$start, " seconds of cpu P-II\@233MHz\n",
"including generating 169911 x 26 element anonymous arrays to stor
+e the results.\n";
#print "\nApplying combined indices to @data1\n\n";
#print @data1[@{$^AoAoCombs[$_]}], $/ for (0..$#{$AoAoCombs});
+ # Apply the indices
__END__
# Output
C:\test>191902
Generating 6 subsets of 2 from a set of 4
Applying combined indices to just another perl hacker
just hacker
another hacker
perl hacker
just perl
another perl
just another
Applying combined indices to 1 2 3 4
1 4
2 4
3 4
1 3
2 3
1 2
Generating 169911 subsets of 26 from a set of 31
Generating 169911 combinations of 26 from 31 took 251.51 seconds of
+cpu P-II@233MHz
including generating 169911 x 26 element anonymous arrays to store th
+e results.
C:\test>
What's this about a "crooked mitre"? I'm good at woodwork! | [reply] [d/l] |
Re: Combinatorics
by blakem (Monsignor) on Aug 22, 2002 at 21:43 UTC
|
A little afternoon golf produced this scary looking solution... It assumes the elements in the array are unique (i.e. a set) and none of them contain the comma character.
#!/usr/bin/perl -wT
use strict;
my @a = 1..5;
my $l = 3;
local $" = ',';
my @combos = grep!$;{"@$_"}++,map[sortsplit','],
grep!/([^,]+).*,\1,/,glob"{@a},"x$l;
print "@$_\n" for @combos;
__END__
1,2,3
1,2,4
1,2,5
1,3,4
1,3,5
1,4,5
2,3,4
2,3,5
2,4,5
3,4,5
Update: There is a subtle bug in the second line of the golfed statement... grep!/([^,]+).*,\1,/ is supposed to filter out all permutations that have doubled elements, but its a bit faulty. It doesn't affect the final value of @combos though. Anyone want to take a guess at it? Whats the bug, and why doesn't it matter in the end?
-Blake
| [reply] [d/l] [select] |
Re: Combinatorics
by I0 (Priest) on Aug 23, 2002 at 05:03 UTC
|
use Data::Dumper;
my @my_array = (1..7);
my $my_sub_array_length = 4;
print Dumper(combinations($my_sub_array_length, @my_array));
sub combinations {
my($len,@a)=@_;
return
map{ my $c=$_<<1; [grep{($c>>=1)&1}@a]}
&{sub{
my @ret = ();
my $x;
for($_=(1<<shift)-1;
($x=$_)<1<<$_[0];
$x&=~$x>>1,$x&=-$x,$_+=$x--,($x&=$_)?($_-=$x,$_+=$x/($
+x&-$x)):0
){ push @ret,$_ }
@ret;
}}($len,0+@a);
}
| [reply] [d/l] |
A non-recursive solution
by Thelonius (Priest) on Aug 23, 2002 at 15:43 UTC
|
#!perl -w
use strict;
# kenhirsch at myself.com 2002-08-23
my $r = shift or die "usage: combinations r a b c d e ...\n";
my @out = combinations($r, \@ARGV);
for (@out) {
print join(" ", @{$_}), "\n";
}
# From Algorith L in Knuth Vol. 4 Sec 7.2.1.3 (not yet published)
sub combinations {
my ($t, $arrayref) = @_;
my @c = 0 .. $t-1;
my @range = reverse @c;
my $j;
my @result;
$c[$t] = scalar(@{$arrayref});
$c[$t + 1] = 0;
do {
push @result, [@{$arrayref}[@c[@range]]];
for ($j=0; $c[$j] + 1 == $c[$j+1]; $j++) {
$c[$j] = $j;
}
$c[$j]++;
} while ($j < $t);
return @result;
}
| [reply] [d/l] |
Re: Combinatorics
by jackdied (Monk) on Aug 25, 2002 at 07:51 UTC
|
Consider this a plug,
http://probstat.sourceforge.net Is my combination/permutation/cartesian back-of-the-envelope algos written in C with python bindings. I'm currently in the process of upgrading the algos and adding more python functionality (slices).
I've been meaning to add PerlXS bindings, but haven't had the time to learn anything complicated in XS. If someone can write an XS interface for one of the objects, I can fake it for the rest.
The license is GPL, oddly enough I'm working on it at this moment (pulling from http://sources.redhat.com/gsl/ the Gnu Scientific library for better C algos where I can). If you want to see what this perl code I wrote Name Me! MixMatch? looks like translated from C to perl, check it out.
| [reply] |
|
If someone can write an XS interface for one of the objects, I can fake it for the rest.
Don't. Take a look at Inline::C or Inline::C++. You'll find that it's a lot easier than you think.
Greetings,
Christian
| [reply] |
|
|