Limbic~Region has asked for the wisdom of the Perl Monks concerning the following question:

I am trying to improve my solution to this problem by incorporating the shortcuts I outlined here. This requires ordering the powerset of prime factors in some sort of progressive order:
input = 2, 3, 5, 7 output: 2 2 3 2 3 5 2 3 5 7 ------- 2 3 7 ------- 2 5 2 5 7 ------- 2 7 ------- 3 3 5 3 5 7 ------- 3 7 ------- 5 5 7 ------- 7 -------

In this simple example, the number is 210 so any factor larger than 14 need not be considered. This means that after checking 2*3*5, we can skip 2*3*5*7 and after we check 3*5, we can skip 3*5*7.

Additional savings can be found if the primes repeat since it will lead to duplicate subsets. Prior to determining the product of the subset, we can easily determine if we have already seen this subset and skip. If this previously seen subset was also too large, we can skip the entire progression.

Anyone have any ideas about how to implement this?

Cheers - L~R

Update: You can see the result of this exercise here

Replies are listed 'Best First'.
Re: Generating powerset with progressive ordering
by blokhead (Monsignor) on Feb 25, 2005 at 16:26 UTC
    I had an easier time thinking of this as manipulating the characteristic sequence of the subsets (i.e, a string of 0's and 1's, where the first bit is a 1 if the first factor is included in the subset, etc). If you look at the characteristic sequences given in your example, you can see that there are three rules being applied (you can print out $str to see what's going on under the covers). Anyway, this code does its manipulations on the characteristic sequence and converts it to the appropriate subset:
    sub iter { my @factors = @_; my $str; return sub { if (not defined $str) { $str = "1" . ("0" x $#factors); return map { substr($str, $_, 1) ? $factors[$_] : () } 0 .. $#factors; } for ($str) { s/0(0*)$/1$1/ or s/11$/01/ or s/^(.*)10(.*)$/"${1}01" . "0" x length $2/e or return; } return map { substr($str, $_, 1) ? $factors[$_] : () } 0 .. $#factors; }; } my $i = iter( 2, 3, 5, 7 ); while (my @s = $i->()) { print "@s\n"; } __END__ 2 2 3 2 3 5 2 3 5 7 2 3 7 2 5 2 5 7 2 7 3 3 5 3 5 7 3 7 5 5 7 7
    I'd be interested to see this simplified a bit. I know you mentioned in the CB that tye had an idea for a solution, and I wonder if he's able to implement it with manipulations to the subsets themselves instead of their characteristic sequences.

    You also mentioned wanting to know when the "next phase" of iterations began (the horizontal lines in your example). You can figure this out by checking which substitution rule was actually applied.

    Update: Here is the code with the "next phase" markers:


      Huh - so apparently your method and my evil method ended up being the same, or at least very similar. Note that you can improve this slightly by consolidating your last two expressions and eliminating the need for an "e" flag on the substitutions:
      for ($str) { s/0(0*)$/1$1/ or s/1(0*)1$/01$1/ or return; }
      And actually, we can combine your method and the last bit in my post to get this somewhat natural structure for looping through the possibilities:
      sub nextmask { if ($_[0] =~ s/0(0*)$/1$1/) {1;} elsif ($_[0] =~ s/1(0*)1$/01$1/) {2;} else {0;} } my @factors = qw(2 3 5 7); my $mask = '0' x scalar(@factors); while (my $transtype = nextmask($mask)) { print "BREAK\n" if ($transtype > 1); # do stuff print join " ", map { substr($mask, $_, 1) ? $factors[$_] : ' ' } 0 .. $#factors; print "\n"; }
      (I've never liked do { ... } while() loops)
      -- @/=map{[/./g]}qw/.h_nJ Xapou cets krht ele_ r_ra/; map{y/X_/\n /;print}map{pop@$_}@/for@/
      I doubt this is how tye would have implemented it, but this is 1 implementation of his 3 simple rules:
      #!/usr/bin/perl use strict; use warnings; my $next = iter_powerset( 1,2,3,4,5 ); while ( my @combo = $next->() ) { print "@combo\n"; } sub iter_powerset { my @factor = @_; my $end = $#factor; my @subset = (undef) x $end; my ($pos, $mode) = (-1, 1); my $return = sub { return @factor[ grep defined, @subset ] }; my %dispatch = ( 1 => sub { ++$pos; $subset[ $pos ] = $pos; ++$mode if $pos == $end; $return->(); }, 2 => sub { $subset[ $pos - 1 ] = undef; ++$mode; $return->(); }, 3 => sub { $subset[ $pos-- ] = undef; while ( $pos >= 0 ) { last if defined $subset[ $pos ]; --$pos; } $subset[ $pos++ ] = undef; return () if ! $pos; $subset[ $pos ] = $pos; $mode = 1; $return->(); }, ); return sub { $dispatch{ $mode }->() }; }
      I am now going to play with making it closer to what I want for the other task.

      Cheers - L~R

      Since tye's 3 simple rules were uttered in passing on the CB, I will summarize them here to help explain the code
      • Mode 1: Fill to the right until you reach the end
      • Mode 2: Remove second to last element
      • Mode 3: Remove last element, increment new last element
        In hopes of better understanding what you were doing, I implemented powersets first as a recursive function and then as an iterator that tries to parallel the recursive version. I thought you might be interested to see what I came up with.

        Caution: Contents may have been coded under pressure.
Re: Generating powerset with progressive ordering
by tall_man (Parson) on Feb 25, 2005 at 18:47 UTC
    I'd like to point out that in your solution you are already using Math::Pari for "factorint", so you could also use "divisors" to get the complete powerset in sorted order.

    Update: adding code at Limbic~Region's request.

    #! /usr/bin/perl use strict; use Math::Pari qw(:int factorint divisors); for (@ARGV) { my $N = $_; my $a = factorint($N); my $div = divisors($a); print "sorted list of divisors of $N is ",$div,"\n"; }

    For 12345, this prints:

    perl 12345 sorted list of divisors of 12345 is [1,3,5,15,823,2469,4115,12345]

    However, I see you are looking for a lazy evaluation method that will produce the whole list in sorted order, one at at time. I saw an example once in Haskell of this sort of thing.

    Suppose N = (p1^a1)*(p2^a2)*...(pr^ar) for primes in sorted order.

    Form a1+1 lists: the first where p1 is not a factor, the second where p1 is a factor only once, ... up to the last list where p1 is a factor a1 times. Merge-sort these lists, in each case taking the lowest item available. That is your answer.

    These lists can be created recursively, by taking the next highest factor to all possible powers and merging in the same way.

    Update: I have written some Haskell code to do the recursive ennumeration of the factor list. Since Haskell does lazy evaluation, this code should compute the "factor square root" without multiplying out all the factors:

    module Powerset where {- Expand muliplies of two list of numbers. For example: mult_expand [1, 2, 4] [1, 5] Produces: [[1,5],[2,10],[4,20]] -} mult_expand p [] = p:[] mult_expand p lst = map (\x -> map (*x) lst) p {- expand the powers of x from 0 to exp -} pows x exp = map (\e -> x^e) [0..exp] {- expand the powers the first item in a pair from 0 to the second item in the pair. -} powerlst p = pows (fst p) (snd p) {- One stage of a merge sort -} merge [] ys = ys merge xs [] = xs merge (x:xs) (y:ys) = if x <= y then x : merge xs (y:ys) else y : merge (x:xs) ys {- Do multiple merges to get an ordered list. -} bigmerge lsts = foldl merge [] lsts {- The complete power set of a list of factor/exponent pairs. For exa +mple: powerset [(2,3),(5,1)] Produces: [1,2,4,5,8,10,20,40] -} powerset [] = [] powerset (x:xs) = bigmerge (mult_expand (powerlst x) (powerset xs)) {- Helper function for factor_sqrt. Finds the last list value less than or equal to the given limit. -} fsqrt lim lastgood [] = lastgood fsqrt lim lastgood (x:xs) = if (x > lim) then lastgood else fsqrt lim x xs {- Find the last divisor less than the square root, given the number and its factors -} factor_sqrt n factors = fsqrt (sqrt n) 0 (powerset factors)

    For example, under Hugs:

    Prelude> :l Powerset.hs Powerset> factor_sqrt 24777695232 [(2,13),(3,8),(461,1)] 149364.0
      Cool, a Haskell solution - you should be on the Pugs team ;-)

      Cheers - L~R

      Regarding using Math::Pari's divisors. This is really cool and I need to learn more about this module. It is however a violation of the original question as stated. Only the prime factorization is to be known initially. I will look at the second part of your response in bit.

      Cheers - L~R

        I object to calling the use of divisors a "violation" of the original question. Given the prime factorization, it is a simple matter to multiply out all the possible combinations, giving the list of divisors. Math::Pari's divisor function just saves the work of doing the multiplications and sorting the results.
Re: Generating powerset with progressive ordering
by fizbin (Chaplain) on Feb 25, 2005 at 17:27 UTC
    Okay, first off, here's a somewhat general method, although it has the disadvantage of being highly recursive. You saw this already from the chatterbox:
    sub printsubsets1 { my ($prefix, @elemarray) = @_; if (!@elemarray) {return;} my $elem = shift @elemarray; print "$prefix$elem\n"; printsubsets1 ("$prefix$elem", @elemarray); printsubsets1 ("$prefix ", @elemarray); } sub printsubsets { printsubsets1("",@_);} printsubsets(qw(a b c d e));

    Now, here's a very evil, but iterative, method that only works on sets of single characters, but that should be enough since anything enumerating 2**26 possiblities is already going to take way, way too long. You could easily change the few lines around the "print" statement so that it dealt with sets of arbitrary words, but I'm just focused on getting the evil out there:

    sub printsubsets2 { my $stuff = join('',@_); my $mask = "\xff" . ("\x00" x (length($stuff)-1)); do { my $str = $mask & $stuff; $str =~ y/\x00/ /; print $str,"\n"; if ($mask =~ s/\xff\x00(\x00*)$/\xff\xff$1/) {} elsif ($mask =~ s/\xff\x00(\x00*)\xff$/\x00\xff$1\x00/) {} elsif ($mask =~ s/\xff{2}$/\x00\xff/) {} else {return;} } while (1); } printsubsets2(qw(a b c d e));

    You can of course take the core of this out and use it somewhat like this:

    sub initialmask { my $nelem = shift; return "\xff" . ("\x00" x ($nelem-1)); } sub nextmask { if ($_[0] =~ s/\xff\x00(\x00*)$/\xff\xff$1/) {1;} elsif ($_[0] =~ s/\xff\x00(\x00*)\xff$/\x00\xff$1\x00/) {1;} elsif ($_[0] =~ s/\xff{2}$/\x00\xff/) {1;} else {0;} } # later in your code: my $mask = initialmask($nfactors); do { # blah blah blah $mask } while (nextmask($mask));
    Not quite as space-efficient as bitstrings, and it doesn't quite generalize as easily if there are duplicates in your "set", but pretty fast. Of course, if you're taking $mask apart I strongly recommend switching to "0" and "1", (instead of \x00 and \xff) since it makes the regular expressions significantly more readable.
    -- @/=map{[/./g]}qw/.h_nJ Xapou cets krht ele_ r_ra/; map{y/X_/\n /;print}map{pop@$_}@/for@/