Beefy Boxes and Bandwidth Generously Provided by pair Networks
Think about Loose Coupling
 
PerlMonks  

Re^6: OT: Finding Factor Closest To Square Root

by Limbic~Region (Chancellor)
on Feb 26, 2005 at 06:52 UTC ( [id://434740]=note: print w/replies, xml ) Need Help??


in reply to Re^5: OT: Finding Factor Closest To Square Root
in thread OT: Finding Factor Closest To Square Root

BrowserUk,
The following code demonstrates that this is not actually true:
#!/usr/bin/perl use strict; use warnings; use Math::Pari qw/factorint PARImat/; print nearest_sqrt( $ARGV[0] ); sub prime_factors { my $N = shift; my %p_fact = PARImat( factorint( $N ) ) =~ /(\d+)/g; return map { ($_) x $p_fact{$_} } sort { $a <=> $b } keys %p_fact; + } sub nearest_sqrt { my $N = shift; my $sqrt = sqrt( $N ); my @factor = prime_factors( $N ); return 1 if @factor == 1; my $end = $#factor; my @subset; my ($pos, $mode) = (-1, 1); my %seen; my $get_factor = sub { if ( $seen{ "@factor[ @subset ]" }++ ) { if ( $mode == 1 ) { push @subset, $pos + 1 .. $end; ++$mode; } return undef; } my $f = 1; $f *= $_ for @factor[ @subset ]; return $f < $sqrt ? $f : undef; }; my %dispatch = ( 1 => sub { push @subset, ++$pos; ++$mode if $subset[ -1 ] == $end; return 1; }, 2 => sub { splice(@subset, $#subset - 1, 1); return $mode++; }, 3 => sub { return () if $subset[ 0 ] == $end; $pos = $subset[ -2 ] + 1; splice(@subset, $#subset - 1, 2, $pos); return $mode = 1; }, ); my ($winner, $offset); while ( $dispatch{ $mode }->() ) { my $factor = $get_factor->() or next; my $diff = ($N / $factor) - $factor; if ( ! defined $offset || $diff < $offset ) { ($winner, $offset) = ($factor, $diff); } } return $winner; }
  • If a subset has been seen before, it is skipped prior to determining the product of the subset
  • If a subset is the start of a progression of subsets previously seen, the entire progression is skipped
  • If a factor is determined to be larger than the sqrt, determining if it is the closest is skipped
Unfortunately, I do not know if this is the fastest implementation. I have several variations ranging from bitstrings to avoiding push/slice. The trouble is that I have found an apparent bug in Math::Pari so I am not sure which solution should be fastest. The bug only seems to manifest itself when I add the sqrt() code. To see the bug for yourself using 24777695232, add print "$_\n" for @factor; right after the array is created. To see it go away, comment out the $sqrt definition and change the return statement to return $f.

Cheers - L~R

Replies are listed 'Best First'.
Re^7: OT: Finding Factor Closest To Square Root
by tall_man (Parson) on Feb 27, 2005 at 22:07 UTC
    I got an entirely different error when I did a sqrt($N) before factoring:
    DB<1> n main::(factsqrt.pl:5): my $N = $_; DB<1> n main::(factsqrt.pl:6): my $sqrt2 = sqrt($N); DB<1> n main::(factsqrt.pl:7): my $a = factorint($N); DB<1> n DB<1> p $a Mat([-1,1])

    However, this is not a bug in Math::Pari, but a side effect of perl's conversion of string to numbers.

    The number "24777695232" first comes in as a string from the command line. Applying the function "sqrt" to it converts it to a double. When you pass it to Math::Pari, it goes in as a double. Naturally, the module fails to do the right thing. If you call a Math::Pari function first, like factorint, it automatically converts the string into a long Pari integer and operates on that.

      tall_man,
      Interesting. That would imply I couldn't do this either:
      my $a = factorint( 24777695232 );
      It seems like a bug to me, but I am finding the documentation on Math::Pari lacking. Is there a good tutorial that covers all of the functions offered and examples on how to use them?

      Cheers - L~R

Re^7: OT: Finding Factor Closest To Square Root
by BrowserUk (Patriarch) on Feb 26, 2005 at 08:13 UTC
    The bug only seems to manifest itself when I add the sqrt() code. .... To see it go away, comment out the $sqrt definition and change the return statement to return $f.

    Sorry Limbic~Region, I do not follow these instructions?

    The only definition of $sqrt I can see is     my $sqrt = sqrt( $N );

    But I do not understand what you mean by "change the return statement to return $f."?


    Examine what is said, not who speaks.
    Silence betokens consent.
    Love the truth but pardon error.
      BrowserUk,
      Sorry - it was past my bedtime last night when I posted. The code in question (with a few extra lines for context looks like this:
      my $N = shift; my $sqrt = sqrt( $N ); my @factor = prime_factors( $N ); return 1 if @factor == 1; # ... rest of the code return $f < $sqrt ? $f : undef;
      To see the problem, add the following line:
      my $N = shift; my $sqrt = sqrt( $N ); my @factor = prime_factors( $N ); # ADD THIS LINE print "$_\n" for @factor; return 1 if @factor == 1; # ... rest of the code return $f < $sqrt ? $f : undef;
      And to see the problem go away:
      my $N = shift; my @factor = prime_factors( $N ); # KEEP THIS LINE print "$_\n" for @factor; return 1 if @factor == 1; # ... rest of the code return $f;
      You can see the list of factors changes with the presence of the sqrt() code (at least it does on my machine).

      Cheers - L~R

        Hmm. I get the same results both ways:

        P:\test>findstr # LR-nearFactor.pl #!/usr/bin/perl # my $sqrt = sqrt( $N ); my $end = $#factor; return $f ; #< $sqrt ? $f : undef; splice(@subset, $#subset - 1, 1); splice(@subset, $#subset - 1, 2, $pos); P:\test>LR-nearFactor.pl 24777695232 11 1 P:\test>findstr # LR-nearFactor.pl #!/usr/bin/perl my $end = $#factor; splice(@subset, $#subset - 1, 1); splice(@subset, $#subset - 1, 2, $pos); P:\test>LR-nearFactor.pl 24777695232 11 1

        Which version of Parei do you have? I have

        ==================== Name: Math-Pari Version: 2.010603 Author: Ilya Zakharevich (cpan@ilyaz.org) Title: Math-Pari Abstract: Perl interface to PARI. InstDate: 21:18:09 2005 Location: http://ppm.ActiveState.com/cgibin/PPM/ppmserver-5.8-windows. +pl?urn:/PPMServer Available Platforms: 1. MSWin32-x86-multi-thread-5.8 ====================

        It's kind of hard to see how using the built in sqrt function would influence factorint()> but have you tried moving the call to sqrt after the call to factorint()?

        Anyway, trying to get beyond that, I picked a few ideas from tall_man's code (which is definitely the benchmark in this thread) and modified your code to read as:

        #!/usr/bin/perl use strict; use warnings; use Math::Pari qw/:int factorint sqrtint PARImat/; my $N = '1' . '0' x $ARGV[0] . '1'; print "Result for $N is ", nearest_sqrt( $N ); sub prime_factors { my $N = shift; my %p_fact = PARImat( factorint( $N ) ) =~ /(\d+)/g; return map { ($_) x $p_fact{$_} } sort { $a <=> $b } keys %p_fact; } sub nearest_sqrt { my $N = shift; my $sqrt = sqrtint( $N ); my @factor = prime_factors( $N ); ....

        As I understand it, ':int', makes all integers in the program Pari compatible?. And using sqrtint() ought to bypass any imcompatibilities.

        The fudge with $ARGV[ 0 ] at the top allows me to test the code on some really big numbers without having to type out all the digits--I applied the same hack to tall_man's code, and then ran a few tests.

        The following is the results from a few carefully chosen (particularly hard) examples timing both programs.

        Going by the time your program takes in comparison to tall_man's, you appear to be doing a similar amount of work, but your code is returning -1, which I do not understand?

        P:\test>timethis TM-nearFactor.pl 55 & timethis LR-nearFactor.pl 55 TimeThis : Command Line : TM-nearFactor.pl 55 TimeThis : Start Time : Sat Feb 26 14:38:33 2005 Result for 100000000000000000000000000000000000000000000000000000001 i +s 7376575663406069736503138401 TimeThis : Command Line : TM-nearFactor.pl 55 TimeThis : Start Time : Sat Feb 26 14:38:33 2005 TimeThis : End Time : Sat Feb 26 14:38:36 2005 TimeThis : Elapsed Time : 00:00:03.281 TimeThis : Command Line : LR-nearFactor.pl 55 TimeThis : Start Time : Sat Feb 26 14:38:37 2005 Result for 100000000000000000000000000000000000000000000000000000001 i +s -1 TimeThis : Command Line : LR-nearFactor.pl 55 TimeThis : Start Time : Sat Feb 26 14:38:37 2005 TimeThis : End Time : Sat Feb 26 14:38:40 2005 TimeThis : Elapsed Time : 00:00:03.265 P:\test>timethis TM-nearFactor.pl 58 & timethis LR-nearFactor.pl 58 TimeThis : Command Line : TM-nearFactor.pl 58 TimeThis : Start Time : Sat Feb 26 14:38:50 2005 Result for 10000000000000000000000000000000000000000000000000000000000 +1 is 22665854592333022426775023 TimeThis : Command Line : TM-nearFactor.pl 58 TimeThis : Start Time : Sat Feb 26 14:38:50 2005 TimeThis : End Time : Sat Feb 26 14:39:07 2005 TimeThis : Elapsed Time : 00:00:17.578 TimeThis : Command Line : LR-nearFactor.pl 58 TimeThis : Start Time : Sat Feb 26 14:39:07 2005 Result for 10000000000000000000000000000000000000000000000000000000000 +1 is -1 TimeThis : Command Line : LR-nearFactor.pl 58 TimeThis : Start Time : Sat Feb 26 14:39:07 2005 TimeThis : End Time : Sat Feb 26 14:39:25 2005 TimeThis : Elapsed Time : 00:00:17.578

        The timings of these two particularly difficult runs, going by how long tall_man'code takes relative to most other runs like:

        P:\test>timethis TM-nearFactor.pl 57 & timethis LR-nearFactor.pl 57 TimeThis : Command Line : TM-nearFactor.pl 57 TimeThis : Start Time : Sat Feb 26 14:38:49 2005 Result for 10000000000000000000000000000000000000000000000000000000001 + is 846610559160061 TimeThis : Command Line : TM-nearFactor.pl 57 TimeThis : Start Time : Sat Feb 26 14:38:49 2005 TimeThis : End Time : Sat Feb 26 14:38:50 2005 TimeThis : Elapsed Time : 00:00:00.140 TimeThis : Command Line : LR-nearFactor.pl 57 TimeThis : Start Time : Sat Feb 26 14:38:50 2005 Result for 10000000000000000000000000000000000000000000000000000000001 + is -1 TimeThis : Command Line : LR-nearFactor.pl 57 TimeThis : Start Time : Sat Feb 26 14:38:50 2005 TimeThis : End Time : Sat Feb 26 14:38:50 2005 TimeThis : Elapsed Time : 00:00:00.109

        which despite being nearly as big a number, is solved in under a 1/5th of a second rather than 17+ seconds above. The very close similarity in timing between your code and tall_man's suggests that you are probably doing a very similar amount of work and possibly arriving at the same result, but for some reason, failing to display it?.

        Maybe a rounding error somewhere. I tried to follow this through--but as you know your code better than I, you might find the solution more quickly.

        Either way, your determination of the appropriate sequence of tests and short-ciruiting is very impressive. I congratulate you++ :).


        Examine what is said, not who speaks.
        Silence betokens consent.
        Love the truth but pardon error.

Log In?
Username:
Password:

What's my password?
Create A New User
Domain Nodelet?
Node Status?
node history
Node Type: note [id://434740]
help
Chatterbox?
and the web crawler heard nothing...

How do I use this?Last hourOther CB clients
Other Users?
Others imbibing at the Monastery: (4)
As of 2024-04-20 03:42 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    No recent polls found