Beefy Boxes and Bandwidth Generously Provided by pair Networks
XP is just a number
 
PerlMonks  

Simple primality testing

by ambrus (Abbot)
on Nov 22, 2005 at 20:53 UTC ( #510901=perlmeditation: print w/ replies, xml ) Need Help??

This meditation is about the comparision of two simple primality testing algorithms.

Here's how it all started. demerphq raised the question of how you can get a small random prime. The answer to this is that you have to take a random number and check if it's a prime. Thus, we get to primality testings.

The following script has two primality testing algorithms. The first is a trivial one using divisions, the second one is the Rabin-Miller test. This script works only up to 2**31, and for such small numbers the simple one is much faster.

I am quite sure that none of these are the fastest ones available, so if you have a better solution, feel free to suggest it.

Update: I'm also not quite sure that my implementation of the Rabin-Miller test is correct. I did do reasonable efforts to test it, but some branches of it are rarely ran over, so there could be bugs that appear only very rarely. I hope it's correct though.

Here's the code,

#!perl use warnings; use strict; use Inline "C", <<__ENDC; /* mulmod(a, b, m) gives (a*b)%c */ long mulmod (long a, long b, long m) { long long t = (long long)a * b; return (long)(t % m); } __ENDC { # Miller--Rabin test my @smallprimes = ( 2, 3, 5, 7, 11, 13, 17, 19, );
# primep($x) returns whether $x is a prime sub primep_rm { my($cand, $base, $p, $iter, $exp, $odd, $t, $x, $xprev); $cand = 0 + $_[0]; 1 < $cand && $cand == int($cand) or die "error: invalid number to primep_rm: $cand"; for $p (@smallprimes) { 0 != $cand % $p or return $p == $cand; } ($exp, $odd) = (0, $cand - 1); while (0 == $odd % 2) { ($exp, $odd) = ($exp + 1, $odd >> 1); } for $iter (1 .. 52) { $base = 1 + int(rand($cand - 1)); $x = powmod($base, $odd, $cand); SQUARING: for $t (1 .. $exp) { ($xprev, $x) = ($x, mulmod($x, $x, $cand)); $x == 1 and do { $xprev == 1 || $xprev == $cand - 1 or return; last; } } # KEY STEP: $x now contains $base**($cand - 1) % $cand # which should be 1 because of the Fermat theorem. $x == 1 or return; } return 1; }
# powmod($a, $f, $m) calculates $a**$f % $m, # precondition: 1 < $m, 0 < $f, $a are integers representable as a per +l integer sub powmod { my($a, $f, $m) = @_; my $r = 1; for (;;) { 0 != ($f & 1) and $r = mulmod($r, $a, $m); $f >>= 1; 0 == $f and last; $a = mulmod($a, $a, $m); } $r; } } { # simple factoring my @primes; { my($n, $p, $sqrp); NLOOP: for $n (2 .. 100_100) { $sqrp = sqrt($n) + 0.5; for $p (@primes) { $sqrp < $p and last; 0 == $n % $p and next NLOOP; } push @primes, $n } } my $greatest_prime_squared = $primes[-1]**2;
sub primep_div { my $cand = 0 + $_[0]; my($p, $sqr); 1 < $cand && $cand == int($cand) or die "error: invalid number to primep_div: $cand"; $cand < $greatest_prime_squared or die "error: number too large in primep_div: $cand"; $sqr = sqrt($cand) + 0.5; for $p (@primes) { $p <= $sqr or return 1; 0 == $cand % $p and return $p == $cand; } die "internal error: shouldn't fall through here"; }
} if (1) { my $test = sub { my($n, $p_rm, $p_div); $n = $_[0]; $p_rm = !!primep_rm($n); $p_div = !!primep_div($n); $p_rm == $p_div or die "error: primality tests don't match: $n"; }; my($n, $r, $l); warn "begin serial tests"; for $n (2 .. 100000) { &$test($n); } warn "begin random tests"; for $l (3 .. 9) { for $r (1 .. 10000) { $n = 2 + int(rand(10**$l)); &$test($n); } } warn "all ok"; } use Benchmark ":all"; if (1) { warn "start benchmarks"; for my $len (3 .. 9) { my($r_rm, $r_div); my $ntries = $len < 8 ? 50_000 : 20_000; $ntries /= 5; my @tries = map { 10**($len - 1) + 1 + 2 * int(rand((10**$len - 10**($len - 1))/2)); } 0 .. $ntries; warn "benchmarking $len digit odd numbers, $ntries in total"; cmpthese(1, { "primep_rm", sub { my $r; $r += primep_rm($_) ? 1 : 0 for @tries; $r_rm = $r; }, "primep_div", sub { my $r; $r += primep_div($_) ? 1 : 0 for @tries; $r_div = $r; }, }); $r_rm == $r_div or die "error: primality tests don't agree in total"; } } __END__

The results show that the division algorithm is faster.

begin serial tests at p line 125. begin random tests at p line 129. all ok at p line 136. start benchmarks at p line 143. benchmarking 3 digit odd numbers, 10000 in total at p line 151. (warning: too few iterations for a reliable count) (warning: too few iterations for a reliable count) benchmarking 4 digit odd numbers, 10000 in total at p line 151. s/iter primep_rm primep_div primep_rm 11.2 -- -97% primep_div 0.300 3637% -- (warning: too few iterations for a reliable count) (warning: too few iterations for a reliable count) benchmarking 5 digit odd numbers, 10000 in total at p line 151. s/iter primep_rm primep_div primep_rm 10.0 -- -96% primep_div 0.370 2614% -- (warning: too few iterations for a reliable count) (warning: too few iterations for a reliable count) benchmarking 6 digit odd numbers, 10000 in total at p line 151. s/iter primep_rm primep_div primep_rm 9.63 -- -95% primep_div 0.490 1865% -- (warning: too few iterations for a reliable count) (warning: too few iterations for a reliable count) benchmarking 7 digit odd numbers, 10000 in total at p line 151. s/iter primep_rm primep_div primep_rm 9.04 -- -92% primep_div 0.740 1122% -- (warning: too few iterations for a reliable count) (warning: too few iterations for a reliable count) benchmarking 8 digit odd numbers, 4000 in total at p line 151. s/iter primep_rm primep_div primep_rm 8.87 -- -85% primep_div 1.34 562% -- (warning: too few iterations for a reliable count) (warning: too few iterations for a reliable count) benchmarking 9 digit odd numbers, 4000 in total at p line 151. s/iter primep_rm primep_div primep_rm 3.36 -- -68% primep_div 1.08 211% -- (warning: too few iterations for a reliable count) (warning: too few iterations for a reliable count) s/iter primep_rm primep_div primep_rm 3.41 -- -30% primep_div 2.37 44% --

Update: I forgot to mention that I took the R-M algorihm from the Cormen–Leiserson–Rivest–Stein book.

Comment on Simple primality testing
Select or Download Code
Re: Simple primality testing
by gaal (Parson) on Nov 22, 2005 at 21:24 UTC

      I don't have perl6 handy, but wouldn't that fail for 2? In that case, sqrt(1) + 1 is greater then 2 so 2 is included in the range.

        You are correct. Nice catch!
Re: Simple primality testing
by TedYoung (Deacon) on Nov 22, 2005 at 21:41 UTC

    My favourite way to test primality is:

    sub is_prime { ('1' x shift) !~ /^(11+)\1+$/ }

    :-)

    Ted Young

    ($$<<$$=>$$<=>$$<=$$>>$$) always returns 1. :-)
Re: Simple primality testing
by Jenda (Abbot) on Nov 22, 2005 at 22:30 UTC

    Maybe I'm missing something, but if you want a "small" (whatever that means) random prime number then generating a random number, testing whether it's a prime and then generating tens and hundreds more until you accidentaly tramp over a prime sounds horribly inefficient to me. And actually not gauranteed to finish at all. Unless your "small" means something like "less than 20" primes are fairly sparse so on average you have to generate quite a few random numbers to find a single prime. For a reasonable definition of "small" you'd be much better off if you just pregenerate a list of all "small" primes and then keep selecting a random item from the list.

    Jenda
    XML sucks. Badly. SOAP on the other hand is the most powerfull vacuum pump ever invented.

      My original requirements were for primes with 7-8 digits, or more generally, those that can be represented by a long.

      testing whether it's a prime and then generating tens and hundreds more until you accidentaly tramp over a prime sounds horribly inefficient to me.

      Surprisingly it doesnt take very long at all for my particular task. But you are right it isn't efficient. But efficiency in this particular case is not required at all. If the program takes a few seconds then fine, but as it is it finishes in a heartbeat.

      ---
      $world=~s/war/peace/g

        I expected slightly smaller "small primes" :-) And I stand corrected regarding the sparseness of primes. Thanks, you never stop learning :-)

        (For a minute I thought whether it wouldn't be better to generate just one random number and then keep subtracting/incrementing until you find a prime, so that the algorithm is garanteed to finish, but that's not a good idea. Primes are not evenly distributed so some would come up more often than others.)

        Jenda
        XML sucks. Badly. SOAP on the other hand is the most powerfull vacuum pump ever invented.

      primes are fairly sparse so on average you have to generate quite a few random numbers to find a single prime
      The density of primes is high enough that this really isn't a big problem. The prime number theorem says that the density of primes around N is roughtly 1/ln(N). So if you want to find a k-digit prime, you should expect to find one after k tries on average. (You can of course be clever and only consider odd numbers, and then you'll find one after k/2 tries on average). So Rabin-Miller takes O(k) time on k-bit numbers, and actually finding a prime takes O(k) tries on average. That's quite a good tradeoff actually.

      I'm pretty certain this is how most modern crypto software generates RSA primes (guess and then check with Rabin-Miller). No, it's not super fast, but it's quite reasonable, and you usually only have to do it once.

      blokhead

        From the Prime number theorem it will take k tries on average for a k digit number only if you are working in base e and looking at numbers right around e**k. For mortals who work in base 10, it takes an average of about 2.3*k+1 tries or so.

        Things like being clever about odds works from this starting point.

      Firstly, let me explain why random generating a prime isn't that bad.

      The first reason is the same as blokhead has mentioned, that primes aren't so rare. As he has explained, if you pick a random 31-bit number, it's a prime with 1/21 probability.

      The second reason is that even when we don't pick a prime, in most of the cases, it is divisable with a small prime, and in this case, the primality tester algorithm should return false very quickly. Note that even before the R-M test, I divide the number with the first eight primes (this should really be the first 50 primes in real-world applications, but then I wouldn't get much of the R-M algorithm here) first thing I do. Even without those divisions, the R-M test is correct, so they wouldn't strictly be needed, but it really helps finding a random prime number. As a special case, blokhead notes that you can generate a random even number. I do this, but it doesn't help much as the primality tests test the parity of the number anyway. (Update: it was stupid to say that I do this as I haven't posted the code to generate a random prime. Anyway, here it is:

      end update)

      Now let's look at whether building a table of large primes is feasable, esp as spiritway has asked this too. There are 5 million 8-digit primes, so if we really needed five-digit primes, so we could build a table. However, I'd only do this if you needed five-digit primes very often. (demerphq's original application was a random hash seed, so I don't think he needs it that often.) Also, if you need 9-digit primes or even primes less then 2^31, the table gets larger and larger.

Re: Simple primality testing
by demerphq (Chancellor) on Nov 22, 2005 at 23:26 UTC

    Well, about the onlything I can add is that I looked into Knuth's AoP and found algoithm P (1.3.2) which is for calculating the first N primes. I found it interesting as it was bit hard to convert the pseudocode* he had to perl. I switched around your code to do the same, and here they are:

    sub make_primes_ambrus { my $count=shift; my @primes=(2); my( $n, $p, $sqrp); NLOOP: for ($n=3; @primes<$count; $n+=2) { $sqrp = sqrt($n) + 0.5; for $p (@primes) { $sqrp < $p and last; 0 == $n % $p and next NLOOP; } push @primes, $n } return \@primes; } sub make_primes_knuth { my $count=shift; my @primes=(2); my $n=3; PRIME: while (@primes<$count) { push @primes,$n; while ( $n+=2 ) { for my $p (@primes) { my $r=$n % $p; last if !$r; # According to Knuth the proof of the vaildity # of the following test "is interesting and a little # unusual". next PRIME if int($n/$p) <= $p; } } } return \@primes; }

    Incidentally, part of the issue for my original problem is that im generating a random prime relatively rarely, and as a fresh process every time. So the time (and hassle coding) to produce a table of primes to use for the div technique is not really necessary. Doing a brute force check on each random number in turn is just fine for me here.

    * (Incidentally is there any easy way to get $Q and $R from $x/$y where Q is the quotient and R is the remainder? I used $Q=int($x/$y); and $R=$X % $Y; but i wonder if there is a better way, obviously repeated subtraction would leave an integer remainer somewhere, but given todays chip design is that actually more efficient than doing div and mod as seperate tasks? )

    ---
    $world=~s/war/peace/g

      I'm replying to your question about division an modulo together.

      In perl, there's no reason to do this, as the opcodes to do the separating of the two results would take much more time than what you'd gain from calculating the two together. However, this might not be the case with big integers, and indeed, the Math::BigInt module defined the bdiv method in such a way that it can store the remainder at once. (Btw the iquo function of Maple works the same way: it stores the other result. This is not surprising as Maple handles big integers.)

      In C, you can hope that the compiler notices that you use division and modulus on the same number and will optimize them to use just one operation if that helps. If you don't trust the compiler, you can also use the div (or ldiv, lldiv, imaxdiv) functions of the C library, which return both the quotient and the remainder. I think this doesn't have much merit anyway, as if the compiler cannot do the optimization of a division and a modulus together, then either it wouldn't give any speed gain in the particular case (eg because there's no free register to store the other result) or your compiler is crap and you'd gain more speed by installing a new compiler or fiddling with the compiler flags than doing micro-optimizations on division.

Re: Simple primality testing
by spiritway (Vicar) on Nov 23, 2005 at 05:49 UTC

    You might consider generating a list of primes below a given number, and keeping it on disk. Then randomly access one of those numbers. I don't know exactly how many primes are less than 10**8 or so, but it's not altogether overwhelming.

      #!/usr/bin/perl -w # a not too bad approximation to the number of primes # less than N is approximately N/log(N). Better methods # exist, but none are as simple... my $n1 = 10**8; my $n2 = 2**32; $\="\n"; # ~ 5428681 print "Number of primes less than $n1 is ",int($n1/log($n1)); # ~ 193635250 print "Number of primes less than $n2 is ",int($n2/log($n2));

        Actually since these are signed longs it only goes up to 2^31-1, which is a prime, a Mersenne Prime actually. (For those that missed this bit, a Mersenne prime is any prime expressable in the form 2^N-1.)

        I dont know if I'm the only one that finds it interesting that the highest number you can represent with a long is a prime, but I do. :-)

        Now if only they would use questions like this on pub-quizzes. :-)

        ---
        $world=~s/war/peace/g

Re: Simple primality testing
by ambrus (Abbot) on Oct 20, 2007 at 19:18 UTC

    I used Inline::C for a 64-bit integer calculation that couldn't easily be done on a perl compiled with 32-bit integers (the precision of doubles wouldn't be enough). Obviously, Math::BigInt or its fast backend Math::BigInt::GMP would work here. However, salva just mentioned his module Math::Int64 which I think would fit here exactly.

Re: Simple primality testing
by ambrus (Abbot) on Feb 09, 2009 at 16:53 UTC
Re: Simple primality testing
by JavaFan (Canon) on Jun 18, 2009 at 21:50 UTC
    If I want a small random prime, I do
    my $random_prime = [17, 23]->[int rand 2];
    After all, if math professors need a prime number for an example, 95% of them pick either 17 or 23.

      I thought the canonical example prime number was 91.

        No, no, no, that's the "hard to factor" product of two random primes.

Log In?
Username:
Password:

What's my password?
Create A New User
Node Status?
node history
Node Type: perlmeditation [id://510901]
Approved by neniro
Front-paged by grinder
help
Chatterbox?
and the web crawler heard nothing...

How do I use this? | Other CB clients
Other Users?
Others taking refuge in the Monastery: (10)
As of 2014-08-01 17:14 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    Who would be the most fun to work for?















    Results (33 votes), past polls