This meditation is about the comparision of two simple primality testing algorithms.
Here's how it all started.
demerphq raised the question of how you can get a small random prime. The answer to this is that you have to take a random number and check if it's a prime. Thus, we get to primality testings.
The following script has two primality testing algorithms. The first is a trivial one using divisions, the second one is the RabinMiller test. This script works only up to 2**31, and for such small numbers the simple one is much faster.
I am quite sure that none of these are the fastest ones available, so if you have a better solution, feel free to suggest it.
Update: I'm also not quite sure that my implementation of the RabinMiller test is correct. I did do reasonable efforts to test it, but some branches of it are rarely ran over, so there could be bugs that appear only very rarely. I hope it's correct though.
Here's the code,
#!perl
use warnings;
use strict;
use Inline "C", <<__ENDC;
/* mulmod(a, b, m) gives (a*b)%c */
long mulmod (long a, long b, long m) {
long long t = (long long)a * b;
return (long)(t % m);
}
__ENDC
{
# MillerRabin test
my @smallprimes = (
2, 3, 5, 7, 11, 13, 17, 19,
);
# primep($x) returns whether $x is a prime
sub primep_rm {
my($cand, $base, $p, $iter, $exp, $odd, $t, $x, $xprev);
$cand = 0 + $_[0];
1 < $cand && $cand == int($cand) or
die "error: invalid number to primep_rm: $cand";
for $p (@smallprimes) {
0 != $cand % $p or
return $p == $cand;
}
($exp, $odd) = (0, $cand  1);
while (0 == $odd % 2) {
($exp, $odd) = ($exp + 1, $odd >> 1);
}
for $iter (1 .. 52) {
$base = 1 + int(rand($cand  1));
$x = powmod($base, $odd, $cand);
SQUARING: for $t (1 .. $exp) {
($xprev, $x) = ($x, mulmod($x, $x, $cand));
$x == 1 and do {
$xprev == 1  $xprev == $cand  1 or
return;
last;
}
}
# KEY STEP: $x now contains $base**($cand  1) % $cand
# which should be 1 because of the Fermat theorem.
$x == 1 or
return;
}
return 1;
}
# powmod($a, $f, $m) calculates $a**$f % $m,
# precondition: 1 < $m, 0 < $f, $a are integers representable as a per
+l integer
sub powmod {
my($a, $f, $m) = @_;
my $r = 1;
for (;;) {
0 != ($f & 1) and
$r = mulmod($r, $a, $m);
$f >>= 1;
0 == $f and
last;
$a = mulmod($a, $a, $m);
}
$r;
}
}
{
# simple factoring
my @primes;
{
my($n, $p, $sqrp);
NLOOP: for $n (2 .. 100_100) {
$sqrp = sqrt($n) + 0.5;
for $p (@primes) {
$sqrp < $p and last;
0 == $n % $p and next NLOOP;
}
push @primes, $n
}
}
my $greatest_prime_squared = $primes[1]**2;
sub primep_div {
my $cand = 0 + $_[0];
my($p, $sqr);
1 < $cand && $cand == int($cand) or
die "error: invalid number to primep_div: $cand";
$cand < $greatest_prime_squared or
die "error: number too large in primep_div: $cand";
$sqr = sqrt($cand) + 0.5;
for $p (@primes) {
$p <= $sqr or
return 1;
0 == $cand % $p and
return $p == $cand;
}
die "internal error: shouldn't fall through here";
}
}
if (1) {
my $test = sub {
my($n, $p_rm, $p_div);
$n = $_[0];
$p_rm = !!primep_rm($n);
$p_div = !!primep_div($n);
$p_rm == $p_div or
die "error: primality tests don't match: $n";
};
my($n, $r, $l);
warn "begin serial tests";
for $n (2 .. 100000) {
&$test($n);
}
warn "begin random tests";
for $l (3 .. 9) {
for $r (1 .. 10000) {
$n = 2 + int(rand(10**$l));
&$test($n);
}
}
warn "all ok";
}
use Benchmark ":all";
if (1) {
warn "start benchmarks";
for my $len (3 .. 9) {
my($r_rm, $r_div);
my $ntries = $len < 8 ? 50_000 : 20_000;
$ntries /= 5;
my @tries = map {
10**($len  1) + 1 + 2 * int(rand((10**$len  10**($len  1))/2));
} 0 .. $ntries;
warn "benchmarking $len digit odd numbers, $ntries in total";
cmpthese(1, {
"primep_rm", sub {
my $r;
$r += primep_rm($_) ? 1 : 0 for
@tries;
$r_rm = $r;
},
"primep_div", sub {
my $r;
$r += primep_div($_) ? 1 : 0 for
@tries;
$r_div = $r;
},
});
$r_rm == $r_div or
die "error: primality tests don't agree in total";
}
}
__END__
The results show that the division algorithm is faster.
begin serial tests at p line 125.
begin random tests at p line 129.
all ok at p line 136.
start benchmarks at p line 143.
benchmarking 3 digit odd numbers, 10000 in total at p line 151.
(warning: too few iterations for a reliable count)
(warning: too few iterations for a reliable count)
benchmarking 4 digit odd numbers, 10000 in total at p line 151.
s/iter primep_rm primep_div
primep_rm 11.2  97%
primep_div 0.300 3637% 
(warning: too few iterations for a reliable count)
(warning: too few iterations for a reliable count)
benchmarking 5 digit odd numbers, 10000 in total at p line 151.
s/iter primep_rm primep_div
primep_rm 10.0  96%
primep_div 0.370 2614% 
(warning: too few iterations for a reliable count)
(warning: too few iterations for a reliable count)
benchmarking 6 digit odd numbers, 10000 in total at p line 151.
s/iter primep_rm primep_div
primep_rm 9.63  95%
primep_div 0.490 1865% 
(warning: too few iterations for a reliable count)
(warning: too few iterations for a reliable count)
benchmarking 7 digit odd numbers, 10000 in total at p line 151.
s/iter primep_rm primep_div
primep_rm 9.04  92%
primep_div 0.740 1122% 
(warning: too few iterations for a reliable count)
(warning: too few iterations for a reliable count)
benchmarking 8 digit odd numbers, 4000 in total at p line 151.
s/iter primep_rm primep_div
primep_rm 8.87  85%
primep_div 1.34 562% 
(warning: too few iterations for a reliable count)
(warning: too few iterations for a reliable count)
benchmarking 9 digit odd numbers, 4000 in total at p line 151.
s/iter primep_rm primep_div
primep_rm 3.36  68%
primep_div 1.08 211% 
(warning: too few iterations for a reliable count)
(warning: too few iterations for a reliable count)
s/iter primep_rm primep_div
primep_rm 3.41  30%
primep_div 2.37 44% 
Update: I forgot to mention that I took the RM algorihm from the
Cormen–Leiserson–Rivest–Stein book.
Re: Simple primality testing by demerphq (Chancellor) on Nov 22, 2005 at 23:26 UTC 
Well, about the onlything I can add is that I looked into Knuth's AoP and found algoithm P (1.3.2) which is for calculating the first N primes. I found it interesting as it was bit hard to convert the pseudocode* he had to perl. I switched around your code to do the same, and here they are:
sub make_primes_ambrus {
my $count=shift;
my @primes=(2);
my( $n, $p, $sqrp);
NLOOP:
for ($n=3; @primes<$count; $n+=2) {
$sqrp = sqrt($n) + 0.5;
for $p (@primes) {
$sqrp < $p and last;
0 == $n % $p and next NLOOP;
}
push @primes, $n
}
return \@primes;
}
sub make_primes_knuth {
my $count=shift;
my @primes=(2);
my $n=3;
PRIME:
while (@primes<$count) {
push @primes,$n;
while ( $n+=2 ) {
for my $p (@primes) {
my $r=$n % $p;
last if !$r;
# According to Knuth the proof of the vaildity
# of the following test "is interesting and a little
# unusual".
next PRIME if int($n/$p) <= $p;
}
}
}
return \@primes;
}
Incidentally, part of the issue for my original problem is that im generating a random prime relatively rarely, and as a fresh process every time. So the time (and hassle coding) to produce a table of primes to use for the div technique is not really necessary. Doing a brute force check on each random number in turn is just fine for me here.
* (Incidentally is there any easy way to get $Q and $R from $x/$y where Q is the quotient and R is the remainder? I used $Q=int($x/$y); and $R=$X % $Y; but i wonder if there is a better way, obviously repeated subtraction would leave an integer remainer somewhere, but given todays chip design is that actually more efficient than doing div and mod as seperate tasks? )

$world=~s/war/peace/g
 [reply] [d/l] 

I'm replying to your question about division an modulo
together.
In perl, there's no reason to do this,
as the opcodes to do the separating of the two results
would take much more time than what you'd gain from
calculating the two together.
However, this might not be the case with big integers,
and indeed, the Math::BigInt module defined the bdiv
method in such a way that it can store the remainder at
once. (Btw the iquo function of Maple works
the same way: it stores the other result.
This is not surprising as Maple handles big integers.)
In C, you can hope that the compiler notices that
you use division and modulus on the same number and will
optimize them to use just one operation if that helps.
If you don't trust the compiler, you can
also use the div (or ldiv, lldiv, imaxdiv) functions
of the C library, which return both the quotient
and the remainder. I think this doesn't have much merit
anyway, as if the compiler cannot do the optimization of
a division and a modulus together, then either
it wouldn't give any speed gain in the particular case
(eg because there's no free register to store the other
result) or your compiler is crap and you'd gain more
speed by installing a new compiler or fiddling with the
compiler flags than doing microoptimizations on division.
 [reply] [d/l] 
Re: Simple primality testing by TedYoung (Deacon) on Nov 22, 2005 at 21:41 UTC 
sub is_prime {
('1' x shift) !~ /^(11+)\1+$/
}
:)
Ted Young
($$<<$$=>$$<=>$$<=$$>>$$) always returns 1. :)
 [reply] [d/l] [select] 

 [reply] 
Re: Simple primality testing by Jenda (Abbot) on Nov 22, 2005 at 22:30 UTC 
Maybe I'm missing something, but if you want a "small" (whatever that means) random prime number then generating a random number, testing whether it's a prime and then generating tens and hundreds more until you accidentaly tramp over a prime sounds horribly inefficient to me. And actually not gauranteed to finish at all. Unless your "small" means something like "less than 20" primes are fairly sparse so on average you have to generate quite a few random numbers to find a single prime. For a reasonable definition of "small" you'd be much better off if you just pregenerate a list of all "small" primes and then keep selecting a random item from the list.
Jenda
XML sucks. Badly. SOAP on the other hand is the most powerfull vacuum pump ever invented. 
 [reply] 

primes are fairly sparse so on average you have to generate quite a few random numbers to find a single prime
The density of primes is high enough that this really isn't a big problem. The prime number theorem says that the density of primes around N is roughtly 1/ln(N). So if you want to find a kdigit prime, you should expect to find one after k tries on average. (You can of course be clever and only consider odd numbers, and then you'll find one after k/2 tries on average). So RabinMiller takes O(k) time on kbit numbers, and actually finding a prime takes O(k) tries on average. That's quite a good tradeoff actually.
I'm pretty certain this is how most modern crypto software generates RSA primes (guess and then check with RabinMiller). No, it's not super fast, but it's quite reasonable, and you usually only have to do it once.
 [reply] 

 [reply] 

Firstly, let me explain why random generating a prime
isn't that bad.
The first reason is the same as blokhead has mentioned,
that primes aren't so rare. As he has explained, if you
pick a random 31bit number, it's a prime with 1/21
probability.
The second reason is that even when we don't pick a prime,
in most of the cases, it is divisable with a small prime,
and in this case, the primality tester algorithm should
return false very quickly.
Note that even before the RM test, I divide the number
with the first eight primes (this should really be the
first 50 primes in realworld applications,
but then I wouldn't get much of the RM
algorithm here) first thing I do.
Even without those divisions, the RM test is correct,
so they wouldn't strictly be needed,
but it really helps finding a random prime number.
As a special case, blokhead notes that you can generate
a random even number. I do this, but it doesn't help much
as the primality tests test the parity of the number anyway.
(Update: it was stupid to say that I do this as I haven't
posted the code to generate a random prime. Anyway,
here it is:
end update)
Now let's look at whether building a table of large primes
is feasable, esp as spiritway has asked this too.
There are 5 million 8digit primes, so if we really needed
fivedigit primes, so we could build a table.
However, I'd only do this if you needed fivedigit primes
very often. (demerphq's original application
was a random hash seed, so I don't think he needs it that
often.)
Also, if you need 9digit primes or even primes less then
2^31, the table gets larger and larger.
 [reply] [d/l] 

My original requirements were for primes with 78 digits, or more generally, those that can be represented by a long.
testing whether it's a prime and then generating tens and hundreds more until you accidentaly tramp over a prime sounds horribly inefficient to me.
Surprisingly it doesnt take very long at all for my particular task. But you are right it isn't efficient. But efficiency in this particular case is not required at all. If the program takes a few seconds then fine, but as it is it finishes in a heartbeat.

$world=~s/war/peace/g
 [reply] 

I expected slightly smaller "small primes" :) And I stand corrected regarding the sparseness of primes. Thanks, you never stop learning :)
(For a minute I thought whether it wouldn't be better to generate just one random number and then keep subtracting/incrementing until you find a prime, so that the algorithm is garanteed to finish, but that's not a good idea. Primes are not evenly distributed so some would come up more often than others.)
Jenda
XML sucks. Badly. SOAP on the other hand is the most powerfull vacuum pump ever invented. 
 [reply] 
Re: Simple primality testing by gaal (Parson) on Nov 22, 2005 at 21:24 UTC 
The following Perl 6 implementation probably isn't the fastest you can find, but it certainly one of the shortest:
sub is_prime (Int $n) {
$n % all(2 .. sqrt($n)+1);
}
Credit: Autrijus in the Apocalypse Now talk.  [reply] [d/l] 

 [reply] [d/l] 

You are correct. Nice catch!
 [reply] 

 [reply] [d/l] 
Re: Simple primality testing by spiritway (Vicar) on Nov 23, 2005 at 05:49 UTC 
You might consider generating a list of primes below a given number, and keeping it on disk. Then randomly access one of those numbers. I don't know exactly how many primes are less than 10**8 or so, but it's not altogether overwhelming.
 [reply] 

 [reply] 

#!/usr/bin/perl w
# a not too bad approximation to the number of primes
# less than N is approximately N/log(N). Better methods
# exist, but none are as simple...
my $n1 = 10**8;
my $n2 = 2**32;
$\="\n";
# ~ 5428681
print "Number of primes less than $n1 is ",int($n1/log($n1));
# ~ 193635250
print "Number of primes less than $n2 is ",int($n2/log($n2));
 [reply] [d/l] 

Actually since these are signed longs it only goes up to 2^311, which is a prime, a Mersenne Prime actually. (For those that missed this bit, a Mersenne prime is any prime expressable in the form 2^N1.)
I dont know if I'm the only one that finds it interesting that the highest number you can represent with a long is a prime, but I do. :)
Now if only they would use questions like this on pubquizzes. :)

$world=~s/war/peace/g
 [reply] 

Re: Simple primality testing by ambrus (Abbot) on Oct 20, 2007 at 19:18 UTC 
I used Inline::C for a 64bit integer calculation that couldn't easily be done on a perl compiled with 32bit integers (the precision of doubles wouldn't be enough). Obviously, Math::BigInt or its fast backend Math::BigInt::GMP would work here. However, salva just mentioned his module Math::Int64 which I think would fit here exactly.
 [reply] 
Re: Simple primality testing by ambrus (Abbot) on Feb 09, 2009 at 16:53 UTC 
 [reply] 
Re: Simple primality testing by JavaFan (Canon) on Jun 18, 2009 at 21:50 UTC 
If I want a small random prime, I do
my $random_prime = [17, 23]>[int rand 2];
After all, if math professors need a prime number for an example, 95% of them pick either 17 or 23.  [reply] [d/l] 

 [reply] 

No, no, no, that's the "hard to factor" product of two random primes.
 [reply] 

