http://www.perlmonks.org?node_id=440742


in reply to Re: Challenge: Another Infinite Lazy List
in thread Challenge: Another Infinite Lazy List

This particular challenge can be done much more simply with a filter approach instead of merging multiple lists.

This is the precise description of my solution (except that no "filtering" is required; it's enough to merge the lists of multiples).

Update: see here and here.

the lowliest monk

Replies are listed 'Best First'.
Re^3: Challenge: Another Infinite Lazy List
by tall_man (Parson) on Mar 18, 2005 at 18:13 UTC
    The two approaches are not the same at all. Say there are N numbers and R prime factors. The merge methods use:

    N*(1/p1 + 1/p2 + ... 1/pR)*(R-1)

    comparison operations to process a list of size N.

    The filter method tests each number with a single gcd calculation, which is O(logN) and is independent of the total number of factors. So to produce a list of size N should be O(NlogN).

    Depending on the size and quantity of factors, one or the other will be better. I would expect the filter to be better when there are many small factors, and the merge to be better when there are few factors or the factors are large.

      Here's a filter that is O(F=number of factors * P=product of factors) + O(N) to check N numbers. It takes advantage of the fact that the output is periodic: if N is a member, then N + P is also a member. This could be written caching-style, so you only have to do your gcd check if you don't yet have a result for N % P; that would improve your order of efficiency without the potential big up-front cost.
      use strict; use warnings; { my @bases = (2,3,5); my $product = 1; $product *= $_ for @bases; my @key; for my $test (0..$product-1) { $key[$test] = grep {$test % $_ == 0} @bases; } print "Key is <@key>\n"; my $iteration = 1; sub limbic_sequence { $iteration = shift if (@_); while ($iteration++) { return $iteration if ($key[$iteration % @key]); } } } print join ', ', map limbic_sequence, 1..50; print "\n"; printf "First after 100000 is: %d\n", limbic_sequence(100000); print join ', ', map limbic_sequence, 1..50; print "\n";

      Caution: Contents may have been coded under pressure.

      You're right. I misread your original post. What threw me off was the mention of streams. In the solution you present (which is indeed entirely different from merging lists), using lazy lists seems forced to me, since their properties don't get you anything that you couldn't get very easily and directly with grep and the '..' operator:

      use strict; # use Stream; use Math::Pari qw(gcd); # We can test all the factors at once using the greatest common diviso +r algorithm. my @factors = @ARGV; my $test_number = 1; $test_number *= $_ for @factors; die "Must provide a list of non-zero factors\n" unless $test_number > +1; sub has_common_factor { gcd($_[0], $test_number) != 1 } # my $integers = Stream::tabulate( sub { $_[0] }, 1); # my $challenge = $integers->filter(\&has_common_factor); # $challenge->show(50); print "@{[grep has_common_factor($_), 1..50]}\n";
      Thanks for pointing out the Streams module. It does some very nice things which I think I should incorporate in my stab at implementing lazy lists. It doesn't use memoization, however, which I think is absolutely essential to a workable implementation of lazy lists, at least in Perl5. What's most curious about this is that Dominus is the author of the Memoize module! Then again, the module is not available from CPAN. I suspect that both these facts have to do with realizing that this sort of object requires a little more syntactic support from Perl to be worth doing.


      Update: Oh, yes. There's memoization in Stream.pm, all right; it's just done very discreetly (and elegantly, IMO):

      sub tail { my $t = $_[0]{t}; if (ref $t eq CODE) { # It is a promise $_[0]{t} = &$t; # <--- there! } $_[0]{t}; }


      BTW the execution time of my merged-lists solution is not growing anywhere nearly as fast as the formula you gave (not that it matters a whole lot, but I give the stats below if anyone's interested). Still, the gcd-based solution is noticeably faster; whether this is the result of the fact that most of its heavy lifting (the gcd calculation) is done by a C routine, or something more fundamental is not yet clear to me.

      the lowliest monk

        There are a couple of advantages of the lazy list form over grep.

        1) You can get an exact number of values out of the sequence without having to know the maximum number range in advance. For example, your sample code produces only 36 values, not the expected 50. You would have to feed it a range up to 273 to get a full 50.

        2) You could extract and print values in a loop without building a large result array (a map in void context could do the same, I suppose).

        My estimates for timing were rough back-of-the-envelope calculations, so I'm not surprised that they are a bit off. There are a number of things I didn't take into account, such as the overlaps where more than one prime is a factor, and the fact that more numbers are eliminated with more factors, so you have to search bigger integers to finish the list. Thanks for your timing tests.

        Math::Pari is a highly optimized math engine, so I'm not too surpised that single gcd checks beat multiple merges. But if you start with primes at 101 and above instead, the merge algorithm will win.