Recursive algorithms are often simple and intuitive. Unfortunately, they are also often explosive in terms of memory and execution time required. Take, for example, the NchooseM algorithm:
# Given a list of M items and a number N,
# generate all sizeN subsets of M
sub choose_n {
my $n = pop;
# Base cases
return [] if $n == 0 or $n > @_;
return [@_] if $n == @_;
# otherwise..
my ($first, @rest) = @_;
# combine $first with all N1 combinations of @rest,
# and generate all Nsized combinations of @rest
my @include_combos = choose_n(@rest, $n1);
my @exclude_combos = choose_n(@rest, $n);
return ( (map {[$first, @$_]} @include_combos)
, @exclude_combos );
}
Great, as long as you don't want to generate all 10element subsets of a 20item list. Or 45choose20. In those cases, you will need an iterator. Unfortunately, iteration algorithms are generally completely unlike the recursive ones they mimic. They tend to be a lot trickier.
But they don't have to be. You can often write iterators that look like their recursive counterparts — they even include recursive calls — but they don't suffer from explosive growth. That is, they'll still take a long time to get through a billion combinations, but they'll start returning them to you right away, and they won't eat up all your memory.
The trick is to create iterators to use in place of your recursive calls, then do a little justintime placement of those iterator creations.
So let's take a first stab at choose_n. First, our base cases are going to be subs that return whatever they were returning before, but after returning those values once, they don't return anything anymore:
sub iter_choose_n {
my $n = pop;
# Base cases
my $once = 0;
return sub {$once++ ? () : []} if $n == 0 or $n > @_;
my ($first, @rest) = @_;
return sub {$once++ ? () : [$first, @rest]} if $n == @_;
Apart from the iterator trappings, we've got essentially what we had before. Converting the map into an iterator involves some similar work, but the parallels are still pretty obvious. We exhaust the first iterator before turning to the second:
# otherwise..
my $include_iter = iter_choose_n(@rest, $n1);
my $exclude_iter = iter_choose_n(@rest, $n);
return sub {
if (my $set = $include_iter>()) {
return [$first, @$set];
}
else {
return $exclude_iter>();
}
}
We now have a recursivelydefined iterator that wasn't a heck of a lot more complex than our original algorithm. That's the good news. The bad news is: it's still doubly recursive, O(2^N) in space and time, and so will take a long time to start generating data. Time for a little trick. Because we don't use $exclude_iter until we've exhausted $include_iter, we can delay defining it:
# otherwise..
my $include_iter = iter_choose_n(@rest, $n1);
my $exclude_iter;
return sub {
if (my $set = $include_iter>()) {
return [$first, @$set];
}
else {
$exclude_iter = iter_choose_n(@rest, $n);
return $exclude_iter>();
}
}
}
Now our code is singly recursive, O(N) in space and time to generate an iterator, and that makes a big difference. Big enough that you probably won't need to go to the trouble of coming up with an O(1) truly iterative solution.
Of course, if you complete the iterations, eventually you will have generated those 2^N subs, and they'll clog up your memory. You may not be concerned about that (you may not be expecting to perform all that many iterations), but if you are, you can put a little code in to free up exhausted iterators:
# otherwise..
my $include_iter = iter_choose_n(@rest, $n1);
my $exclude_iter;
return sub {
if ($include_iter and my $set = $include_iter>()) {
return [$first, @$set];
}
else {
if ($include_iter) {
undef $include_iter;
$exclude_iter = iter_choose_n(@rest, $n);
}
return $exclude_iter>();
}
}
}
Update: See my response to Kelan for a recipe for conversion from recursion to iterator.
Caution: Contents may have been coded under pressure.
Re: Recursivelygenerated Iterators
by blokhead (Monsignor) on May 19, 2005 at 13:09 UTC

Quite an interesting technique, and as a disclaimer I really do like it. That being said, this technique will not always get you the fastest iterator. The holy grail of iterators for combinatorial structures is a constant amortized time (CAT) algorithm, meaning that any N successive calls to the iterator takes O(N) time. This is as good as it gets.
Your technique is great as the iterator algorithm mirrors the easytounderstand recursive algorithm. And kudos for noting (and fixing) the doublerecursion. However, your algorithm's running time for iterating through all (N choose K) combinations is proportional to (N+2 choose K+1), so it is not CAT. For a detailed analysis of this and many other interesting iteration algorithms, see section 4.3 of the book Combinatorial Generation by Frank Ruskey. The book does give a CAT iterator for generating combinations, and I believe your general technique of mirroring the recursive algorithm inherently cannot achieve CAT running time (unless you memoize, which defeats the whole purpose of iterating to save memory).
I guess it's a matter of what you want to trade  (sometimes small) factors of running time, or readability of the algorithm. Although for iteration problems like this, I usually just refer to Ruskey's book and find a prepackaged CAT algorithm ;)
 [reply] 

 [reply] 
Re: Recursivelygenerated Iterators
by spurperl (Priest) on May 19, 2005 at 05:18 UTC

Nice ++
I wonder how general your technique is  I mean is it applicable only the to computation you present, or is it generally good for all memoryeating recursive computations ? Can one formulate a general algorithm for this transformation ?
By the way, another technique to ease these computations is Memoization. It is almost always very helpful for recursive algorithms like the one you present.  [reply] 

 [reply] 
Re: Recursivelygenerated Iterators
by kelan (Deacon) on May 19, 2005 at 21:01 UTC

sub choose_n_iter {
my @todo = [ shift, [ @_ ], [] ];
return sub {
while ( @todo ) {
my ( $n, $pool, $tally ) = @{ shift @todo };
return $tally if $n == 0;
next if @$pool == 0;
my ( $first, @rest ) = @$pool;
push @todo, [ $n  1, \@rest, [ @$tally, $first ]],
[ $n , \@rest, $tally ];
}
return;
};
}
And here is a more effecient version:
sub choose_n_iter {
my @todo = [ shift, [ @_ ], [] ];
return sub {
while ( @todo ) {
my ( $n, $pool, $tally ) = @{ shift @todo };
return $tally if $n == 0;
return [ @$tally, @$pool ] if $n == @$pool;
next if @$pool == 0;
my ( $first, @rest ) = @$pool;
push @todo, [ $n  1, \@rest, [ @$tally, $first ]];
push @todo, [ $n , \@rest, $tally ] if @res
+t;
}
return;
};
}
Thanks goes out to HOP for the recursiontoiterator help.
 [reply] [d/l] [select] 

Your solution is more traditional. It does essentially the same thing as mine — delaying evaluation of later elements until they're needed — by keeping an explicit stack (or what should be a stack, but you seem to be using it as a queue, which explains the different order of return values). You note that it "avoids recursion altogether", but that's not really a legitimate goal. Recursion is not a bad thing in and of itself. It is often a very helpful thing.
The reason I think my solution is an improvement is that it saves you the trouble of manipulating a stack, and allows the iterative solution to more closely resemble the recursive one. In a word: simplicity. You can follow a foolproof recipe, and in five or ten minutes have an iterator that returns exactly the same set as the recursive one. Same order, same structure, same arguments. Here's the recipe (this is new since my original post):
 Emptyset return values are replaced by empty subs
 Explicit (nonrecursing) return values are assigned to an array; the iterator will be a sub that shifts that array
 A list of return values that includes recursive calls is replaced by a list of subwrapped iterators (wrapping the iterator calls in subs delays their evaluation), the iterator will be a sub that shifts its way through that list, unwrapping iterators until it finds one that returns something.
 See the rewrite of iter_choose_n (at the end of this post) for what to do when you have to accumulate data with a recursive call, as choose_n does (but Hanoi doesn't).
That last is a little complicated to describe in words, so I'll demonstrate with Towers of Hanoi.
Rewriting iter_choose_n to work with the boilerplate:
Caution: Contents may have been coded under pressure.
 [reply] [d/l] [select] 
Re: Recursivelygenerated Iterators
by runrig (Abbot) on May 20, 2005 at 04:39 UTC

I had my own idea of how to do it, and I came up with restartable iterators:
In the append iterator, when the second iterator is exhausted, it causes the first iterator to iterate, and then restarts the second iterator at the appropriate starting point.
Update:Unlike the other solutions, the solution above returns an array of indices instead of a set of elements from a list,
but that's easy to adjust: sub choose_n {
my $n = shift;
my @set = @_;
my $iter = choose_m_of_n_iter($n, scalar(@set));
sub {
@set[$iter>()];
}
}
Update: Simplified code. Which may or may not be a good thing :)  [reply] [d/l] [select] 

 [reply] 

Could you describe what a restartable iterator is, and what problem it addresses?
Well, I made up the name (and so maybe it's a badly chosen name), but what I mean is, an iterator whose behaviour you can modify after it has already started (in this case, it's a simple iterator, but I can restart it at any point each time it is exhausted). In iterating over the result set, I noticed that in choosing M out of a set of N objects (in a normal nested for loop way), the inner most loop would iterate from M to N, then from M+1 to N, then M+2 to N, etc (down to N to N), while the next outer loop would iterate from M1 to N1, then from M to N1, then M+1 to N1, etc. The outer most loop just iterates from 1 to NM+1 (assuming the array index starts at 1, though in above program it starts at zero).
That was the basic pattern at first, but it was more complicated eventually, because the first time that the innermost iterator got down to the "N to N" iteration, it would have to restart at M+1 at the next iteration. So I thought it might be easier to construct an iterator that could tell the next "inner" iterator where to start (the ending point is always the same for each iterator in this case). The iterators I've seen so far return a function that accepts no arguments, but in my "restartable_iter()", it returns a function that can accept a new starting point.
This, of course, is not a generic solution for all iteration problems, because one iterator must be aware of how to tell the next iterator in line what to do. But it was fun to come up with.
 [reply] 
Re: Recursivelygenerated Iterators
by Roy Johnson (Monsignor) on Jun 02, 2005 at 01:53 UTC

Just another example of a conversion (using the boilerplate recipe). I'm posting mostly so it will be here to refer back to if I revisit the topic. It's interestingly different from the other examples, and the recipe still works well (although I get a curious Use of uninitialized value in array element warning if I don't turn those warnings off (which I do, below)).
What I'm generating is permutations, like Algorithm::Loops NextPermute.
#!perl
use strict;
use warnings;
sub no_repeat_combos {
return [@_] unless @_ > 1;
# Find the first occurrence of each unique element
my %seen;
defined($seen{$_[$_]}) or $seen{$_[$_]} = $_ for 0..$#_;
# For each unique element, stick it on the front of each
# of the norepeatcombos of the rest
map {
my $first_pos = $_;
my @rest = @_[ grep {$first_pos != $_} 0..$#_ ];
map [$_[$first_pos], @$_], no_repeat_combos(@rest);
} (sort {$a <=> $b} values %seen);
}
sub nrc_iter {
# Base cases get assigned to an array, which the iterator shifts t
+hrough
my @base_case = ([@_]);
return sub{ shift @base_case } unless @_ > 1;
# Find the first occurrence of each unique element
my %seen;
defined($seen{$_[$_]}) or $seen{$_[$_]} = $_ for 0..$#_;
# For each unique element, stick it on the front of each
# of the norepeatcombos of the rest
my @arg_list = @_;
my @sub_iter =
map {
my $first_pos = $_;
my @rest = @_[ grep {$first_pos != $_} 0..$#_ ];
sub {
my $recurse = nrc_iter(@rest);
my $set;
no warnings 'uninitialized';
sub { ($set = $recurse>()) ? [$arg_list[$first_pos], @$set]
+ : () }
}
} sort {$a <=> $b} values %seen;
# Below here is boilerplate: if you've done the above steps right,
+ just plug
# this in, and it works. It returns the first iterator from the li
+st that
# returns anything.
# Grab and unwrap an iterator from the list
my $iter = (shift @sub_iter)>();
return sub {
my $rval;
$iter = (shift @sub_iter)>()
until ($rval = $iter>() or @sub_iter == 0);
return $rval;
}
}
for ([1], [1,1], [1,2], [qw(a b a)], [qw(a b b a)]) {
print "=== @$_ ===\n";
my $i = nrc_iter(@$_);
print " @$_\n" while $_ = $i>();
}
Caution: Contents may have been coded under pressure.
 [reply] [d/l] [select] 

