 Think about Loose Coupling PerlMonks

ulam's spiral too slow

 on Apr 15, 2007 at 19:11 UTC Need Help??

orange has asked for the wisdom of the Perl Monks concerning the following question:

hi
ulam spiral is the prime numbers spiral, you can find info how to draw it in:
Ulam_spiral

each white dot represent a prime number. in the result we will see that the primes have some limited patterns. the following code draw the spiral (a square spiral),

#!/usr/local/bin/perl use strict; use Tk; my (\$order,\$pntcycle,\$pnts,\$b,\$num,\$p) = (0); my \$lastnumber; my \$NumOfPrimes=0; my (\$i, \$colr, \$counter); my(\$o, \$s) = (250, 20); my(\$pi, \$x, \$y) = (3.1415926, 0, 0); print "type the maximum number for the prime spiral \n"; my \$lastnumber = <>; print "please wait \n"; my \$mw = MainWindow->new; my \$c = \$mw->Canvas(-width => 500, -height => 500); \$c->pack; \$c->createLine(50, 250, 450, 250); \$c->createText(10, 250, -fill => 'blue', -text => 'X'); \$c->createLine(250, 50, 250, 450); \$c->createText(250, 10, -fill => 'blue', -text => 'Y'); OUTER: for (\$i= 1; \$i <= 250; \$i += 1) { \$order=\$order+1; if (\$order == 5){\$order = 1;} \$pntcycle = \$pntcycle + 1; if (\$pntcycle == 3) {\$pntcycle = 1; \$pnts = \$pnts + 1;} \$counter=0; while (\$counter <= \$pnts ) { \$counter++; \$num = \$num + 1; if (\$num == \$lastnumber+1) {last OUTER;} &is_prime(); if (\$b != 0) {\$colr = "white"; \$NumOfPrimes = \$NumOfPrimes + 1;} if (\$b == 0) {\$colr = "black";} if (\$order == 1) {\$x = \$x + 1;} elsif (\$order == 2) {\$y = \$y - 1;} elsif (\$order == 3) {\$x = \$x - 1; } elsif (\$order == 4) {\$y = \$y + 1; } \$c->createText( \$x+\$o, \$y+\$o, -fill => "\$colr", -text => '.'); } } MainLoop; sub is_prime { \$p=0; \$b=0; my \$j=0; my @prime_; for(\$j=1; \$j <= \$num; \$j++) { if(\$num % \$j==0) { \$prime_[\$p] = "\$j"; \$p++; } if (\$prime_ == \$num) { \$b = 1; } } }
but i have two problems, it is very slow, in my celeron 2 Ghz i draw the points up to 5000 in about 20 seconds. the algorithm for the checking the primality of numbers i have copied it from somewhere in this forum with a small variation.
the other problem is that the final plot appear after the calculation have finished, i can't see the process of drawing dot by dot.

for comparison here is a visual basic 6 code and an executable file to demonstrate the defference:
http://mihd.net/z58lv6
i am using ActivePerl-5.8.8.817-MSWin32-x86
windows xp

Replies are listed 'Best First'.
Re: ulam's spiral too slow
by liverpole (Monsignor) on Apr 15, 2007 at 20:05 UTC
Hi orange,

The main problem you're having is in your is_prime subroutine.

You are calculating primes based on whether a number N is divisible by any value from 1 to N, when you only need to be looking at all odd values between 3 and sqrt(N), but including the special case of 2.  Additionally, you should be caching the values (you may have thought you were doing this with my @prime_;, but the correct way would be to use a hash.  (Additionally, though you were checking the value with if (\$prime_ == \$num) { \$b = 1; }, you were still letting the entire loop run its course, and thus defeating its purpose).

Update:  But see grinder's important comment, below, on an even further degree of optimization.

Update:  bart caught a bug in my program; namely, that it needs to test for division by 2!  I've added this in the code below.

Here's a rewrite of the entire program:

#!/usr/local/bin/perl use strict; use warnings; use Tk; my \$NumOfPrimes = 0; my (\$colr, \$counter); my(\$o, \$s) = (250, 20); my(\$pi, \$x, \$y) = (3.1415926, 0, 0); print "type the maximum number for the prime spiral \n"; my \$lastnumber = <>; print "please wait \n"; my \$mw = MainWindow->new; my \$c = \$mw->Canvas(-width => 500, -height => 500); \$c->pack; \$c->createLine(50, 250, 450, 250); \$c->createText(10, 250, -fill => 'blue', -text => 'X'); \$c->createLine(250, 50, 250, 450); \$c->createText(250, 10, -fill => 'blue', -text => 'Y'); my \$num = 0; my \$order = 0; my \$pnts = 0; my \$pntcycle = 0; OUTER: for (my \$i= 1; \$i <= 250; \$i += 1) { \$order=\$order+1; if (\$order == 5) {\$order = 1;} ++\$pntcycle; if (\$pntcycle == 3) { \$pntcycle = 1; ++\$pnts; } \$counter = 0; while (\$counter <= \$pnts) { \$counter++; ++\$num; # This is the more common way to increment a n +umber if (\$num == \$lastnumber + 1) { last OUTER } my \$b = is_prime(\$num); if (\$b != 0) {\$colr = "white"; \$NumOfPrimes = \$NumOfPrimes + + 1;} if (\$b == 0) {\$colr = "black";} if (\$order == 1) {\$x = \$x + 1;} elsif (\$order == 2) {\$y = \$y - 1;} elsif (\$order == 3) {\$x = \$x - 1; } elsif (\$order == 4) {\$y = \$y + 1; } \$c->createText( \$x+\$o, \$y+\$o, -fill => "\$colr", -text => ' +.'); } } MainLoop; BEGIN { # Cache primes for future use, but using a *hash*, NOT an array! # Start with the value "2", which lets us avoid my %primes = ( 2 => 1 ); sub is_prime { my (\$num) = @_; if (exists(\$primes{\$num})) { return \$primes{\$num}; } # Only need to calculate up to sqrt(\$num). # This is VERY important, especially with larger numbers. # If N is divisible by some factor \$f, less than sqrt(\$N), # then it's also divisible by some other factor \$k = \$N/\$f # which MUST be larger than sqrt(\$N), and vice versa. my \$sqrt = int(sqrt(\$num)); # Check for division by 2 (special-case) (\$num % 2) or return 0; # We've tried 2 as a factor, so any other factors must be odd # (and prime, for that matter). for (my \$j = 3; \$j <= \$sqrt; \$j += 2) { if (0 == \$num % \$j) { # It's NOT a prime. return 0; } } # Cache the prime number and return '1' return \$primes{\$num} = 1; } }

A couple of other points ...

1. You should use warnings, to get lots of valuable debugging information, like the fact that \$lastnumber was declared twice.
2. my (\$order,\$pntcycle,\$pnts,\$b,\$num,\$p) = (0); likely does NOT do what you want.  It's only setting the value of \$order to zero.  To set them all equal to zero, do something like my \$order = \$my \$pntcycle = my \$pnts = ... = my \$p = 0;, or individually (the way I did above).
3. It's not generally a good idea to make everything global, especially values into and out of a subroutine.  You really should be passing \$num to is_prime(), and returning the value which you assign to \$b.
4. You don't need to quote numbers when you put them into a hash -- \$prime_[\$p] = "\$j"; is better written \$prime_[\$p] = \$j.
5. It's more "Perl-like" to increment a variable with ++\$var.  \$var = \$var + 1 looks too much like, well, BASIC. :-)
6. Many will find your code easier to read if you indent it.

s''(q.S:\$/9=(T1';s;(..)(..);\$..=substr+crypt(\$1,\$2),2,3;eg;print\$..\$/
you only need to be looking at all values between 2 and sqrt(N)

I was about to tick you off, for not suggesting examining only odd numbers, but in fact your code does exactly that. Nonetheless, the fact that you are caching previously discovered primes admits an elegant optimisation:

you only need to be looking at all prime numbers between 2 and sqrt(N).

Retooling your most excellent code is left as an exercise to the reader :)

• another intruder with the mooring in the heart of the Perl

++grinder, an excellent point.

My laziness in not optimizing it further has been revealed.

As for counting by twos, I certainly could claim that my background gets the credit (I was a mathematics major), but I have to confess that many years ago I was the victim of a similar faux pas.

I was taking a Pascal course in college (my first structure programming language -- best language ever -- until I learned "C" a few years later).  The term project I had chosen was to create a program that generated patterns from prime numbers.  I wrote a similar is_prime subroutine which counted by ones, and a friend who looked at my code said "you realize of course...".  Whoops!

s''(q.S:\$/9=(T1';s;(..)(..);\$..=substr+crypt(\$1,\$2),2,3;eg;print\$..\$/
As grinder points out, since the primes are being found in ascending order, it would make sense (and go really fast) to limit the for loop to just the hash keys that are the primes found so far.

But there's another issue in comparing the OP code with liverpole's revision: the outputs are different.

If I add  warn "\$x \$y \$colr\n"; after the call to "createText" in each version, then run each of them with a "maximum number" value of 6000 and redirect stderr to a separate file for each version, both scripts produce 6000 lines of output, and the x and y values match perfectly line-by-line in the two files. But the color values are quite different, starting at the first line.

I wanted to check that, because it seemed to me that liverpole's faster version was not as "dark" as the OP version, and sure enough, the optimized/faster script produces 1694 white pixels, 4306 black ones, in contrast to the OP code, which produces only 783 white, and 5217 black. The two versions disagree on color in 911 of the 6000 outputs. (That is, all the disagreements involve points where the OP said "black" and liverpole said "white".)

It probably involves the strange use of  \$prime_ in the OP (where it probably should have been "\$p" instead of "1", but it's hard to be sure what the intent was there). update: I had assumed (probably wrongly) that the OP code was drawing primes in black, but my tabulation showed 783 white dots drawn for a max value of 6000, and that turns out to be the correct number of primes in that range. Sorry liverpole, it looks like there's a problem in your math somewhere.

Your is_prime function claims that 4 is a prime number. Would you please please please check on even-ness of a number, before you even start going through the whole checking loop?
Thanks for catching that.  I just noticed the bug in my code, which is:  although I'm caching prime numbers, I'm not testing all of them as potential factors.

I'll go update the code now, but you're right; there needs to be a check for division by 2.

s''(q.S:\$/9=(T1';s;(..)(..);\$..=substr+crypt(\$1,\$2),2,3;eg;print\$..\$/
considering
is_prime

In general it is much faster to generate an array of ones and fill it muliplicatively with zeroes. All ones left indicate a prime. Note that multiplication is usually much faster than division and it provides a kind of build-in cache.

Can you (or any fellow monk) please clarify the need for caching the prime numbers in this particular scenario?

The is_prime() check for the current \$num does not depend on previous values (1..\$num-1). I can understand the benefits of caching in other cases like fibonacci series where F(n) = F(n-1) + F(n-2). The OP is interating through N (e.g. 5000) points (\$iter = 1..N) and checking is_prime(\$iter) on each value. The \$iter values are unique and is_prime() is called once for each unique value. The data is plotted immediately and the is_prime'ness is not used further in the code.

While it is beneficial to cache the computed primes, in this particular scenario (OP's post), it is not providing any advantages. On the contrary, it ends up increasing memory usage of the application

Mahesh

Can you (or any fellow monk) please clarify the need for caching the prime numbers in this particular scenario?
I know why I would cache it.

You are familiar with the optimization for check primality by only testing with odd numbers, which has been pointed out several times already in this thread. You can optimize even more by only testing division with prime numbers. It's actually the same kind of optimization: apart from 2, all even numbers are not prime (hence, needn't be tested) because they're all divisible by 2.

This extra optimization can reduce the number of tests you need to do significantly, especially for large numbers. You are already calculating all lower prime numbers in sequence, anyway. So you've already done the footwork. All you still have to do is store the found values. And change the code so it makes use of that.

And liverpole's code doesn't make use of that. Tsk.

I currently don't have the time to build a full-fledged memoized version of is_prime (but I might come back to it later). Maybe somebody else may feel the inclination...? It's not easy, you need to keep track of upto what level you have already calculated all primes, and fall back to liverpole's code if you're above that level, possibly raising the level if the step is small enough, i.e. you just checked the next odd number.

And, uh, I suspect liverpole found himself in the same situation, and that he gave up on it, too. In that scenarion, the cache of the primes may just be a leftover.

Re: ulam's spiral too slow
by graff (Chancellor) on Apr 16, 2007 at 03:16 UTC
After seeing the difference in outputs between the OP code and liverpole's version, and then actually looking up the references at wikipedia, I realized that there were other problems besides execution speed: way too many spots were being drawn in black, which meant way too many numbers were being categorized as primes I felt compelled to figure out who was wrong; I have updated my earlier reply accordingly.

I also noticed that the OP code actually draws all the points (from 1 to \$max), both the supposed primes and the supposed non-primes. But you really only need to draw one set, and just leave the other set as "background color".

Then, I thought it would be helpful to get a sense of what took longest -- computing or drawing -- so I decided to refactor the code as follows:

• run through all the desired numeric range before doing any drawning, keeping track of the x/y coords as we go;

• each time we find a prime number, store that as both a hash key and as an item in an ascending array; use the hash value to store the x/y coords where the dot should be drawn for that prime;

• once all the primes have been found, set up the Tk::Canvas, then run through the sorted list of prime values (which are hash keys), and draw a dot for each one.

As it turns out, the computing takes about twice as long as the drawing (4 sec vs. 2 sec for a max number of 150,000, which pretty well fills the given grid); but bear in mind the compute step loops over all integers in the given range, while the drawing step only loops over the primes.

And luckily, the resulting canvas looks a lot more like the pictures I saw over at wikipedia (it still might not be entirely correct -- but you could try dumping a portion of the %primecoord hash structure to make sure the values are as intended).

Note that I used "\xb7" (centered dot) as the character -- seemed like "." (period) was off-center.

(updated to correct the comments about the "%vector" hash, to reflect the fact that coords in Tk::Canvas put 0,0 at the upper left corner)

Re: ulam's spiral too slow
by shmem (Chancellor) on Apr 15, 2007 at 20:15 UTC
This response is off-topic as it doesn't address your grievances - but for fun.. have a PostScript version below. I grabbed this one somewhere from the internet, and frobbed, twiddled and twirked it a bit to render a spiraly spiral, not a square one...

View this beast with gv.

--shmem

_(\$_=" "x(1<<5)."?\n".q·/)Oo.  G°\        /
/\_¯/(q    /
----------------------------  \__(m.====·.(_("always off the crowd"))."·
");sub _{s./.(\$e="'Itrs `mnsgdq Gdbj O`qkdq")=~y/"-y/#-z/;\$e.e && print}
Re: ulam's spiral too slow
by kyle (Abbot) on Apr 16, 2007 at 02:10 UTC

A few days back I wrote a prime generator just because I hadn't done it in a while. I thought it came out pretty nice. It caches the primes that it finds and uses them for finding the new ones (that is, it uses the algorithm discussed by other monks: checking each candidate against primes less than the candidate's square root). Here it is:

use strict; use warnings; my @primes = ( 2, 3 ); my \$candidate = \$primes[-1] + 2; my \$max_prime = shift || 100_000; CANDIDATE: while ( \$candidate < \$max_prime ) { my \$sqrt_candidate = sqrt \$candidate; PRIME: foreach my \$prime ( @primes ) { last PRIME if \$prime > \$sqrt_candidate; next CANDIDATE if 0 == \$candidate % \$prime; } push @primes, \$candidate; } continue { \$candidate += 2; } print join q{,}, @primes; print "\n";
Re: ulam's spiral too slow
by atcroft (Abbot) on Apr 15, 2007 at 19:48 UTC

As to drawing dot-by-dot in Tk, you might want to look at Dr. Mu's response to my first posting, which was a question about how to manipulate individual pixels in Tk.

HTH.

Does the following code help any?

With it, I was able to generate (on an Athlon XP 2900+) 348_100 points (590**2 points) in 83 wallclock seconds, 220_900 (470**2) in 47 wallclock seconds, and 10_000 (100**2) in 2 wallclock seconds.

(As an aside, it would probably be much quicker if there were a formula for determining the position based on the value, but I couldn't wrap my mind around one that would work. Anyone?)

HTH.

thanks very much atcroft , many ideas in your program, which will help. while searching in http://groups.google.com/group/comp.lang.perl.tk
i have found that we can draw the screen gradually, ie dot by dot with animation by inserting the line
\$mw->update;
after
\${\$img}->put( q{BLUE}, -to => ( \$image_w_2 + \$p{x}, \$image_h_2 + \$p{y} ) ) if ( \$numbers\$i );
so we can enjoy watching the process of drawing.
Re: ulam's spiral too slow
by Limbic~Region (Chancellor) on Apr 15, 2007 at 21:14 UTC
Re: ulam's spiral too slow
by ambrus (Abbot) on Jun 04, 2008 at 19:39 UTC

Just in case anyone stumbles to this node when searching for "ulam's spiral", the other ulam's spiral thread is Spiraling integers.

Create A New User
Node Status?
node history
Node Type: perlquestion [id://610218]
Approved by Corion
help
Chatterbox?
and the web crawler heard nothing...

How do I use this? | Other CB clients
Other Users?
Others making s'mores by the fire in the courtyard of the Monastery: (5)
As of 2019-10-23 00:33 GMT
Sections?
Information?
Find Nodes?
Leftovers?
Voting Booth?
In 2019 the site I miss most is:

Results (57 votes). Check out past polls.

Notices?