Re^6: OT: Finding Factor Closest To Square Root

BrowserUk,
The following code demonstrates that this is not actually true:

#!/usr/bin/perl
use strict;
use warnings;
use Math::Pari qw/factorint PARImat/;

print nearest_sqrt( $ARGV[0] );

sub prime_factors {
    my $N = shift;
    my %p_fact = PARImat( factorint( $N ) ) =~ /(\d+)/g;
    return map { ($_) x $p_fact{$_} } sort { $a <=> $b } keys %p_fact;
+ 
}

sub nearest_sqrt {
    my $N = shift;
    my $sqrt = sqrt( $N );
    my @factor = prime_factors( $N );
    return 1 if @factor == 1;
    my $end = $#factor;
    my @subset;
    my ($pos, $mode) = (-1, 1);

    my %seen;
    my $get_factor = sub {
        if ( $seen{ "@factor[ @subset ]" }++ ) {
            if ( $mode == 1 ) {
                push @subset, $pos + 1 .. $end;
                ++$mode;
            }
            return undef;
        }
        my $f = 1;
        $f *= $_ for @factor[ @subset ];
        return $f < $sqrt ? $f : undef;
    };


    my %dispatch = (
        1 => sub {
            push @subset, ++$pos;
            ++$mode if $subset[ -1 ] == $end;
            return 1;
        },
        2 => sub {
            splice(@subset, $#subset - 1, 1); 
            return $mode++;
        },
        3 => sub {
            return () if $subset[ 0 ] == $end; 
            $pos = $subset[ -2 ] + 1;
            splice(@subset, $#subset - 1, 2, $pos); 
            return $mode = 1;
        },
    );

    my ($winner, $offset);

    while ( $dispatch{ $mode }->() ) {
        my $factor = $get_factor->() or next;

        my $diff = ($N / $factor) - $factor;
        if ( ! defined $offset || $diff < $offset ) {
            ($winner, $offset) = ($factor, $diff);
        }
    }
    return $winner;
}
[download]

If a subset has been seen before, it is skipped prior to determining the product of the subset
If a subset is the start of a progression of subsets previously seen, the entire progression is skipped
If a factor is determined to be larger than the sqrt, determining if it is the closest is skipped

Unfortunately, I do not know if this is the fastest implementation. I have several variations ranging from bitstrings to avoiding push/slice. The trouble is that I have found an apparent bug in Math::Pari so I am not sure which solution should be fastest. The bug only seems to manifest itself when I add the sqrt() code. To see the bug for yourself using 24777695232, add print "$_\n" for @factor; right after the array is created. To see it go away, comment out the $sqrt definition and change the return statement to return $f.

Cheers - L~R

Comment on Re^6: OT: Finding Factor Closest To Square Root Select or Download Code

Replies are listed 'Best First'.
Re^7: OT: Finding Factor Closest To Square Root by tall_man (Parson) on Feb 27, 2005 at 22:07 UTC
I got an entirely different error when I did a sqrt($N) before factoring: `DB<1> n main::(factsqrt.pl:5): my $N = $_; DB<1> n main::(factsqrt.pl:6): my $sqrt2 = sqrt($N); DB<1> n main::(factsqrt.pl:7): my $a = factorint($N); DB<1> n DB<1> p $a Mat([-1,1])` [download] However, this is not a bug in Math::Pari, but a side effect of perl's conversion of string to numbers. The number "24777695232" first comes in as a string from the command line. Applying the function "sqrt" to it converts it to a double. When you pass it to Math::Pari, it goes in as a double. Naturally, the module fails to do the right thing. If you call a Math::Pari function first, like factorint, it automatically converts the string into a long Pari integer and operates on that.	[reply] [d/l]
Re^8: OT: Finding Factor Closest To Square Root by Limbic~Region (Chancellor) on Feb 28, 2005 at 19:50 UTC
tall_man, Interesting. That would imply I couldn't do this either: `my $a = factorint( 24777695232 );` [download] It seems like a bug to me, but I am finding the documentation on Math::Pari lacking. Is there a good tutorial that covers all of the functions offered and examples on how to use them? Cheers - L~R	[reply] [d/l]
Re^7: OT: Finding Factor Closest To Square Root by BrowserUk (Patriarch) on Feb 26, 2005 at 08:13 UTC
The bug only seems to manifest itself when I add the sqrt() code. .... To see it go away, comment out the $sqrt definition and change the return statement to return $f. Sorry Limbic~Region, I do not follow these instructions? The only definition of $sqrt I can see is `my $sqrt = sqrt( $N );` But I do not understand what you mean by "change the return statement to return $f."? Examine what is said, not who speaks. Silence betokens consent. Love the truth but pardon error.	[reply] [d/l]
Re^8: OT: Finding Factor Closest To Square Root by Limbic~Region (Chancellor) on Feb 26, 2005 at 14:10 UTC
BrowserUk, Sorry - it was past my bedtime last night when I posted. The code in question (with a few extra lines for context looks like this: `my $N = shift; my $sqrt = sqrt( $N ); my @factor = prime_factors( $N ); return 1 if @factor == 1; # ... rest of the code return $f < $sqrt ? $f : undef;` [download] To see the problem, add the following line: `my $N = shift; my $sqrt = sqrt( $N ); my @factor = prime_factors( $N ); # ADD THIS LINE print "$_\n" for @factor; return 1 if @factor == 1; # ... rest of the code return $f < $sqrt ? $f : undef;` [download] And to see the problem go away: `my $N = shift; my @factor = prime_factors( $N ); # KEEP THIS LINE print "$_\n" for @factor; return 1 if @factor == 1; # ... rest of the code return $f;` [download] You can see the list of factors changes with the presence of the sqrt() code (at least it does on my machine). Cheers - L~R	[reply] [d/l] [select]
Re^9: OT: Finding Factor Closest To Square Root by BrowserUk (Patriarch) on Feb 26, 2005 at 15:00 UTC
Hmm. I get the same results both ways: `P:\test>findstr # LR-nearFactor.pl #!/usr/bin/perl # my $sqrt = sqrt( $N ); my $end = $#factor; return $f ; #< $sqrt ? $f : undef; splice(@subset, $#subset - 1, 1); splice(@subset, $#subset - 1, 2, $pos); P:\test>LR-nearFactor.pl 24777695232 11 1 P:\test>findstr # LR-nearFactor.pl #!/usr/bin/perl my $end = $#factor; splice(@subset, $#subset - 1, 1); splice(@subset, $#subset - 1, 2, $pos); P:\test>LR-nearFactor.pl 24777695232 11 1` [download] Which version of Parei do you have? I have `==================== Name: Math-Pari Version: 2.010603 Author: Ilya Zakharevich (cpan@ilyaz.org) Title: Math-Pari Abstract: Perl interface to PARI. InstDate: 21:18:09 2005 Location: http://ppm.ActiveState.com/cgibin/PPM/ppmserver-5.8-windows. +pl?urn:/PPMServer Available Platforms: 1. MSWin32-x86-multi-thread-5.8 ====================` [download] It's kind of hard to see how using the built in sqrt function would influence factorint()> but have you tried moving the call to sqrt after the call to factorint()? Anyway, trying to get beyond that, I picked a few ideas from tall_man's code (which is definitely the benchmark in this thread) and modified your code to read as: `#!/usr/bin/perl use strict; use warnings; use Math::Pari qw/:int factorint sqrtint PARImat/; my $N = '1' . '0' x $ARGV[0] . '1'; print "Result for $N is ", nearest_sqrt( $N ); sub prime_factors { my $N = shift; my %p_fact = PARImat( factorint( $N ) ) =~ /(\d+)/g; return map { ($_) x $p_fact{$_} } sort { $a <=> $b } keys %p_fact; } sub nearest_sqrt { my $N = shift; my $sqrt = sqrtint( $N ); my @factor = prime_factors( $N ); ....` [download] As I understand it, ':int', makes all integers in the program Pari compatible?. And using `sqrtint()` ought to bypass any imcompatibilities. The fudge with `$ARGV[ 0 ]` at the top allows me to test the code on some really big numbers without having to type out all the digits--I applied the same hack to tall_man's code, and then ran a few tests. The following is the results from a few carefully chosen (particularly hard) examples timing both programs. Going by the time your program takes in comparison to tall_man's, you appear to be doing a similar amount of work, but your code is returning -1, which I do not understand? P:\test>timethis TM-nearFactor.pl 55 & timethis LR-nearFactor.pl 55 TimeThis : Command Line : TM-nearFactor.pl 55 TimeThis : Start Time : Sat Feb 26 14:38:33 2005 Result for 100000000000000000000000000000000000000000000000000000001 i +s 7376575663406069736503138401 TimeThis : Command Line : TM-nearFactor.pl 55 TimeThis : Start Time : Sat Feb 26 14:38:33 2005 TimeThis : End Time : Sat Feb 26 14:38:36 2005 TimeThis : Elapsed Time : 00:00:03.281 TimeThis : Command Line : LR-nearFactor.pl 55 TimeThis : Start Time : Sat Feb 26 14:38:37 2005 Result for 100000000000000000000000000000000000000000000000000000001 i +s -1 TimeThis : Command Line : LR-nearFactor.pl 55 TimeThis : Start Time : Sat Feb 26 14:38:37 2005 TimeThis : End Time : Sat Feb 26 14:38:40 2005 TimeThis : Elapsed Time : 00:00:03.265 P:\test>timethis TM-nearFactor.pl 58 & timethis LR-nearFactor.pl 58 TimeThis : Command Line : TM-nearFactor.pl 58 TimeThis : Start Time : Sat Feb 26 14:38:50 2005 Result for 10000000000000000000000000000000000000000000000000000000000 +1 is 22665854592333022426775023 TimeThis : Command Line : TM-nearFactor.pl 58 TimeThis : Start Time : Sat Feb 26 14:38:50 2005 TimeThis : End Time : Sat Feb 26 14:39:07 2005 TimeThis : Elapsed Time : 00:00:17.578 TimeThis : Command Line : LR-nearFactor.pl 58 TimeThis : Start Time : Sat Feb 26 14:39:07 2005 Result for 10000000000000000000000000000000000000000000000000000000000 +1 is -1 TimeThis : Command Line : LR-nearFactor.pl 58 TimeThis : Start Time : Sat Feb 26 14:39:07 2005 TimeThis : End Time : Sat Feb 26 14:39:25 2005 TimeThis : Elapsed Time : 00:00:17.578 [download] The timings of these two particularly difficult runs, going by how long tall_man'code takes relative to most other runs like: P:\test>timethis TM-nearFactor.pl 57 & timethis LR-nearFactor.pl 57 TimeThis : Command Line : TM-nearFactor.pl 57 TimeThis : Start Time : Sat Feb 26 14:38:49 2005 Result for 10000000000000000000000000000000000000000000000000000000001 + is 846610559160061 TimeThis : Command Line : TM-nearFactor.pl 57 TimeThis : Start Time : Sat Feb 26 14:38:49 2005 TimeThis : End Time : Sat Feb 26 14:38:50 2005 TimeThis : Elapsed Time : 00:00:00.140 TimeThis : Command Line : LR-nearFactor.pl 57 TimeThis : Start Time : Sat Feb 26 14:38:50 2005 Result for 10000000000000000000000000000000000000000000000000000000001 + is -1 TimeThis : Command Line : LR-nearFactor.pl 57 TimeThis : Start Time : Sat Feb 26 14:38:50 2005 TimeThis : End Time : Sat Feb 26 14:38:50 2005 TimeThis : Elapsed Time : 00:00:00.109 [download] which despite being nearly as big a number, is solved in under a 1/5th of a second rather than 17+ seconds above. The very close similarity in timing between your code and tall_man's suggests that you are probably doing a very similar amount of work and possibly arriving at the same result, but for some reason, failing to display it?. Maybe a rounding error somewhere. I tried to follow this through--but as you know your code better than I, you might find the solution more quickly. Either way, your determination of the appropriate sequence of tests and short-ciruiting is very impressive. I congratulate you++ :). Examine what is said, not who speaks. Silence betokens consent. Love the truth but pardon error.	[reply] [d/l] [select]
Re^10: OT: Finding Factor Closest To Square Root by Limbic~Region (Chancellor) on Feb 27, 2005 at 14:29 UTC
Re^11: OT: Finding Factor Closest To Square Root by BrowserUk (Patriarch) on Feb 27, 2005 at 16:10 UTC


Think about Loose Coupling
	PerlMonks