http://www.perlmonks.org?node_id=610221

in reply to ulam's spiral too slow

Hi orange,

The main problem you're having is in your is_prime subroutine.

You are calculating primes based on whether a number N is divisible by any value from 1 to N, when you only need to be looking at all odd values between 3 and sqrt(N), but including the special case of 2.  Additionally, you should be caching the values (you may have thought you were doing this with my @prime_;, but the correct way would be to use a hash.  (Additionally, though you were checking the value with if (\$prime_[1] == \$num) { \$b = 1; }, you were still letting the entire loop run its course, and thus defeating its purpose).

Update:  But see grinder's important comment, below, on an even further degree of optimization.

Update:  bart caught a bug in my program; namely, that it needs to test for division by 2!  I've added this in the code below.

Here's a rewrite of the entire program:

```#!/usr/local/bin/perl

use strict;
use warnings;

use Tk;

my \$NumOfPrimes = 0;
my (\$colr, \$counter);
my(\$o, \$s) = (250, 20);
my(\$pi, \$x, \$y) = (3.1415926, 0, 0);
print "type the maximum number for the prime spiral \n";
my \$lastnumber = <>;
print "please wait \n";
my \$mw = MainWindow->new;
my \$c = \$mw->Canvas(-width => 500, -height => 500);
\$c->pack;
\$c->createLine(50, 250, 450, 250);
\$c->createText(10, 250, -fill => 'blue', -text => 'X');
\$c->createLine(250, 50, 250, 450);
\$c->createText(250, 10, -fill => 'blue', -text => 'Y');

my \$num      = 0;
my \$order    = 0;
my \$pnts     = 0;
my \$pntcycle = 0;
OUTER:
for (my \$i= 1; \$i <= 250; \$i += 1) {
\$order=\$order+1;
if (\$order == 5) {\$order = 1;}
++\$pntcycle;
if (\$pntcycle == 3) {
\$pntcycle = 1;
++\$pnts;
}
\$counter = 0;
while (\$counter <= \$pnts) {
\$counter++;
++\$num;     # This is the more common way to increment a n
+umber
if (\$num == \$lastnumber + 1) { last OUTER }

my \$b = is_prime(\$num);
if (\$b != 0) {\$colr = "white"; \$NumOfPrimes = \$NumOfPrimes
+ + 1;}
if (\$b == 0) {\$colr = "black";}

if (\$order == 1) {\$x = \$x + 1;}
elsif (\$order == 2) {\$y = \$y - 1;}
elsif (\$order == 3) {\$x = \$x - 1; }
elsif (\$order == 4) {\$y = \$y + 1; }

\$c->createText( \$x+\$o, \$y+\$o, -fill => "\$colr", -text => '
+.');
}
}
MainLoop;

BEGIN {
# Cache primes for future use, but using a *hash*, NOT an array!
# Start with the value "2", which lets us avoid
my %primes = ( 2 => 1 );

sub is_prime {
my (\$num) = @_;

if (exists(\$primes{\$num})) {
return \$primes{\$num};
}

# Only need to calculate up to sqrt(\$num).
# This is VERY important, especially with larger numbers.
# If N is divisible by some factor \$f, less than sqrt(\$N),
# then it's also divisible by some other factor \$k = \$N/\$f
# which MUST be larger than sqrt(\$N), and vice versa.
my \$sqrt = int(sqrt(\$num));

# Check for division by 2 (special-case)
(\$num % 2) or return 0;

# We've tried 2 as a factor, so any other factors must be odd
# (and prime, for that matter).
for (my \$j = 3; \$j <= \$sqrt; \$j += 2) {
if (0 == \$num % \$j) {
# It's NOT a prime.
return 0;
}
}
# Cache the prime number and return '1'
return \$primes{\$num} = 1;
}
}
[download]```

A couple of other points ...

1. You should use warnings, to get lots of valuable debugging information, like the fact that \$lastnumber was declared twice.
2. my (\$order,\$pntcycle,\$pnts,\$b,\$num,\$p) = (0); likely does NOT do what you want.  It's only setting the value of \$order to zero.  To set them all equal to zero, do something like my \$order = \$my \$pntcycle = my \$pnts = ... = my \$p = 0;, or individually (the way I did above).
3. It's not generally a good idea to make everything global, especially values into and out of a subroutine.  You really should be passing \$num to is_prime(), and returning the value which you assign to \$b.
4. You don't need to quote numbers when you put them into a hash -- \$prime_[\$p] = "\$j"; is better written \$prime_[\$p] = \$j.
5. It's more "Perl-like" to increment a variable with ++\$var.  \$var = \$var + 1 looks too much like, well, BASIC. :-)
6. Many will find your code easier to read if you indent it.

s''(q.S:\$/9=(T1';s;(..)(..);\$..=substr+crypt(\$1,\$2),2,3;eg;print\$..\$/

Replies are listed 'Best First'.
Re^2: ulam's spiral too slow
by grinder (Bishop) on Apr 15, 2007 at 20:28 UTC
you only need to be looking at all values between 2 and sqrt(N)

I was about to tick you off, for not suggesting examining only odd numbers, but in fact your code does exactly that. Nonetheless, the fact that you are caching previously discovered primes admits an elegant optimisation:

you only need to be looking at all prime numbers between 2 and sqrt(N).

Retooling your most excellent code is left as an exercise to the reader :)

• another intruder with the mooring in the heart of the Perl

++grinder, an excellent point.

My laziness in not optimizing it further has been revealed.

As for counting by twos, I certainly could claim that my background gets the credit (I was a mathematics major), but I have to confess that many years ago I was the victim of a similar faux pas.

I was taking a Pascal course in college (my first structure programming language -- best language ever -- until I learned "C" a few years later).  The term project I had chosen was to create a program that generated patterns from prime numbers.  I wrote a similar is_prime subroutine which counted by ones, and a friend who looked at my code said "you realize of course...".  Whoops!

s''(q.S:\$/9=(T1';s;(..)(..);\$..=substr+crypt(\$1,\$2),2,3;eg;print\$..\$/
Re^2: ulam's spiral too slow
by graff (Chancellor) on Apr 15, 2007 at 21:29 UTC
As grinder points out, since the primes are being found in ascending order, it would make sense (and go really fast) to limit the for loop to just the hash keys that are the primes found so far.

But there's another issue in comparing the OP code with liverpole's revision: the outputs are different.

If I add  warn "\$x \$y \$colr\n"; after the call to "createText" in each version, then run each of them with a "maximum number" value of 6000 and redirect stderr to a separate file for each version, both scripts produce 6000 lines of output, and the x and y values match perfectly line-by-line in the two files. But the color values are quite different, starting at the first line.

I wanted to check that, because it seemed to me that liverpole's faster version was not as "dark" as the OP version, and sure enough, the optimized/faster script produces 1694 white pixels, 4306 black ones, in contrast to the OP code, which produces only 783 white, and 5217 black. The two versions disagree on color in 911 of the 6000 outputs. (That is, all the disagreements involve points where the OP said "black" and liverpole said "white".)

It probably involves the strange use of  \$prime_[1] in the OP (where it probably should have been "\$p" instead of "1", but it's hard to be sure what the intent was there). update: I had assumed (probably wrongly) that the OP code was drawing primes in black, but my tabulation showed 783 white dots drawn for a max value of 6000, and that turns out to be the correct number of primes in that range. Sorry liverpole, it looks like there's a problem in your math somewhere.

Re^2: ulam's spiral too slow
by bart (Canon) on Apr 16, 2007 at 10:42 UTC
Your is_prime function claims that 4 is a prime number. Would you please please please check on even-ness of a number, before you even start going through the whole checking loop?
Thanks for catching that.  I just noticed the bug in my code, which is:  although I'm caching prime numbers, I'm not testing all of them as potential factors.

I'll go update the code now, but you're right; there needs to be a check for division by 2.

s''(q.S:\$/9=(T1';s;(..)(..);\$..=substr+crypt(\$1,\$2),2,3;eg;print\$..\$/
thanks, your code is much faster than mine, and now it produce the same figure as in the
http://en.wikipedia.org/wiki/Ulam_spiral
after we invert the colors
one small note is that your code will plot number 1 as a prime number, we can find this if we insert
print "number of primes = \$NumOfPrimes \n"; before the mainloop; and try the program with number 10 , it will give us the number of primes is 5
as a variation i suggest for the invistigators for small spirals to replace the main plotting code:
\$c->createText( \$x*+\$o, \$y+\$o, -fill => "\$colr", -text => '.'); with
\$c->createText( \$x*30+\$o, \$y*30+\$o, -fill => "\$colr", -text => "\$num"); so we can see numbers instead of dots in wich the primes will be in white color , the factor 30 above is for magnification of the plotting.
Re^2: ulam's spiral too slow
by dokkeldepper (Friar) on Apr 16, 2007 at 07:23 UTC
considering
```is_prime
[download]```

In general it is much faster to generate an array of ones and fill it muliplicatively with zeroes. All ones left indicate a prime. Note that multiplication is usually much faster than division and it provides a kind of build-in cache.
Re^2: ulam's spiral too slow
by smahesh (Pilgrim) on Apr 16, 2007 at 07:03 UTC

Can you (or any fellow monk) please clarify the need for caching the prime numbers in this particular scenario?

The is_prime() check for the current \$num does not depend on previous values (1..\$num-1). I can understand the benefits of caching in other cases like fibonacci series where F(n) = F(n-1) + F(n-2). The OP is interating through N (e.g. 5000) points (\$iter = 1..N) and checking is_prime(\$iter) on each value. The \$iter values are unique and is_prime() is called once for each unique value. The data is plotted immediately and the is_prime'ness is not used further in the code.

While it is beneficial to cache the computed primes, in this particular scenario (OP's post), it is not providing any advantages. On the contrary, it ends up increasing memory usage of the application

Mahesh

Can you (or any fellow monk) please clarify the need for caching the prime numbers in this particular scenario?
I know why I would cache it.

You are familiar with the optimization for check primality by only testing with odd numbers, which has been pointed out several times already in this thread. You can optimize even more by only testing division with prime numbers. It's actually the same kind of optimization: apart from 2, all even numbers are not prime (hence, needn't be tested) because they're all divisible by 2.

This extra optimization can reduce the number of tests you need to do significantly, especially for large numbers. You are already calculating all lower prime numbers in sequence, anyway. So you've already done the footwork. All you still have to do is store the found values. And change the code so it makes use of that.

And liverpole's code doesn't make use of that. Tsk.

I currently don't have the time to build a full-fledged memoized version of is_prime (but I might come back to it later). Maybe somebody else may feel the inclination...? It's not easy, you need to keep track of upto what level you have already calculated all primes, and fall back to liverpole's code if you're above that level, possibly raising the level if the step is small enough, i.e. you just checked the next odd number.

And, uh, I suspect liverpole found himself in the same situation, and that he gave up on it, too. In that scenarion, the cache of the primes may just be a leftover.

Excellent. Based on your clarification, one can indeed use the divisibility check on previous computed primes to determine if current \$number is prime.

You can learn non-Perl stuff on perlmonks too!! :-)

Thanks, Mahesh