in reply to Re: Accuracy of Random Pi
in thread Pi calculator

I'm not sure that setting srand would really help the situation. For the Monte Carlo method to be most effective, the random numbers produced must be spread out as evenly as possible. In fact, you can be more accurate without random numbers by instead going through all four corners, then their mid-points, then their mid-points' mid-points, e.g.:

    2x2 grid  ->  3x3 grid  ->  5x5 grid  ->  etc.
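As a rough sketch of that systematic approach (my own illustration; the sub name grid_pi is made up), you can sample the centers of a regular k x k grid over the unit square and count how many fall inside the quarter circle:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Estimate Pi by testing every cell center of a regular k x k grid over
# the unit square: the fraction with x^2 + y^2 <= 1 approximates Pi/4.
sub grid_pi {
    my ($k) = @_;
    my $inside = 0;
    for my $i (0 .. $k - 1) {
        for my $j (0 .. $k - 1) {
            my $x = ($i + 0.5) / $k;   # cell centers, not corners
            my $y = ($j + 0.5) / $k;
            $inside++ if $x * $x + $y * $y <= 1;
        }
    }
    return 4 * $inside / ($k * $k);
}

printf "%.6f\n", grid_pi(1000);   # approaches Pi as k grows
```

Unlike a quasi-random sequence, though, this forces you to pick k up front; refining the grid means redoing the whole count.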

However, if you would like to use random numbers, you need one of two types. You either need truly unbiased, evenly distributed random numbers (pseudo-random can work but often doesn't :) ), or you can use a particular kind of sequence that is designed not to repeat and is guaranteed to spread out evenly and ever more finely (very similar to the systematic approach above).

Two examples are Halton and Sobol sequences. If you use them, you get roughly 1/N convergence, instead of the 1/sqrt(N) you get for uniform random numbers (since nothing keeps those from clumping up). I was going to give the version of Antonov and Saleev's variant of Sobol's sequence from "Numerical Recipes in C". However, it's simply too long. Also, I have to wake up for work tomorrow, and the first version I typed in (converting it from C to Perl) didn't work quite right. Oh well.
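As a sketch of the convergence difference (my own illustration, using a Halton pair with prime bases 2 and 3 rather than Sobol, since the Sobol code is what's too long to type in):

```perl
#!/usr/bin/perl
use strict;
use warnings;

# nth element of the base-b Halton sequence: write n in base b,
# mirror its digits across the radix point.
sub halton {
    my ($n, $b) = @_;
    my ($a, $x) = (0, $b);
    while ($n) {
        $a += ($n % $b) / $x;
        $x *= $b;
        $n = int($n / $b);
    }
    return $a;
}

# Dart-board Pi estimate: plain rand() versus 2-D Halton points
# (base 2 for x, base 3 for y).
my $N = 100_000;
my ($hit_r, $hit_q) = (0, 0);
for my $i (1 .. $N) {
    my ($x, $y) = (rand(), rand());
    $hit_r++ if $x * $x + $y * $y <= 1;
    ($x, $y) = (halton($i, 2), halton($i, 3));
    $hit_q++ if $x * $x + $y * $y <= 1;
}
printf "rand:   %.5f\nhalton: %.5f\n", 4 * $hit_r / $N, 4 * $hit_q / $N;
```

On a typical run the Halton estimate lands noticeably closer to Pi than the rand() one for the same N, which is the whole point.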

Good luck; Halton sequences are much easier to do.


Re: Srand versus rand
by extremely (Priest) on Feb 16, 2001 at 12:39 UTC
    Nope, letting perl set srand is good enough. And you are right about Monte Carlo needing a purer random base than a pseudo-random generator. Most Monte Carlo runs use scads of random numbers per cycle, and the pseudo-random generator loops after 2**31 calls on most systems and 2**15 on some.

    Look to CPAN and you will find some of what you need. First off, if you are going to write your own "random" sequence generator you may find PDL handy. If you want someone else to do the work on random numbers, try Math::Random or Math::TrulyRandom, but I would recommend you find an OS-specific random source like Linux's /dev/random and /dev/urandom:

    #!/usr/bin/perl -w
    use strict;
    #use Linux; ## I wish. =) (yeah yeah, $^O, etc etc)

    open UR, "</dev/urandom" or die "Oh man, your system sucks, $!";
    my $pages = shift()+0 || 1;
    die "Give me a number greater than 0 or nothing, bub.\n"
        unless $pages > 0;

    while ($pages-- > 0) {
        my $buf;
        read UR, $buf, 512 or die "Ouch that shouldn't happen, $!";
        for (0..31) {
            print vec($buf, $_*4,   32), "\t",
                  vec($buf, $_*4+1, 32), "\t",
                  vec($buf, $_*4+2, 32), "\t",
                  vec($buf, $_*4+3, 32), "\n";
        }
    }

    $you = new YOU;
    honk() if $you->love(perl)

      It shouldn't matter if there are repeats or not. As long as they do not appear in the same order, it wouldn't really matter, since I am looking to see if A^2 + B^2 is below 1. Actually, if I make two pairs of numbers, one pair where A^2 + B^2 is less than 1 and another where it is not, then I can generate the order in which they appear by some random factor (take a random number and use each digit to pick one of the two pairs depending on whether it's even or not). UPDATE: I now realize what I said was wrong; randomly generating true or false will not generate Pi. Random numbers are sometimes very hard to visualize.
        It shouldn't matter whether there are repeats or not if the random number generator is uniformly unbiased. However, most pseudo-random number generators fail this criterion (in practice, that is; in theory they often pass :) ).

        What I was proposing you use are sometimes called quasi-random numbers. This is because they are not really very random at all: they have a heavy bias against the same value being chosen twice. The advantage here is that you converge faster. Additionally, in the real world they are more likely to be correct (since any unnatural skew in the distribution is tempered by the non-repeating bias).

        The reason it converges faster is that the numbers chosen are forced not to repeat, yet still spread themselves evenly through a given space. I hope you can see why this would converge more quickly -- it's a tad hard to draw an ASCII picture to represent it.

        Basically it's like using an iteratively refined grid (like my last post), but you do not have to commit to any particular number of samples in advance (with the grid, you have to do 4 points on the first iteration, then 5 more on the next, and so on).

        I have to be off to work, but as an example here is a base-2 Halton sequence: 0.0, 0.1, 0.01, 0.11, 0.001, 0.101, 0.011, and 0.111 . You may want to convert these numbers into decimal so you can see the sequence: 0.000, 0.500, 0.250, 0.750, 0.125, 0.625, 0.375, 0.875 . Normally you would use a higher base (which works best when the base is a prime number), but base 2 lets you see how it is slowly filling up a grid for you. Sobol sequences produce more random-looking results than Halton sequences, so they are usually preferable. But anyway, you get the picture.
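        As a tiny sketch of that base-2 case (my own illustration; the sub name vdc2 is made up), the digit reversal is just a bit-reversal:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Base-2 Halton (a.k.a. van der Corput) sequence: mirror the binary
# digits of n across the radix point, least significant digit first.
sub vdc2 {
    my ($n) = @_;
    my ($a, $x) = (0, 0.5);
    while ($n) {
        $a += ($n & 1) * $x;   # lowest bit of n lands in the highest place
        $x /= 2;
        $n >>= 1;
    }
    return $a;
}

print join(" ", map { vdc2($_) } 0 .. 7), "\n";
# 0 0.5 0.25 0.75 0.125 0.625 0.375 0.875
```

Note how each new value splits the largest remaining gap, which is exactly the grid-filling behaviour described above.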


Re (tilly) 1: Srand versus rand
by tilly (Archbishop) on Feb 16, 2001 at 18:12 UTC
    A trick I have mentioned before and may mention again.

    If you need a large supply of random-looking data, one of the best approaches is to grab a large file produced from dynamic data (/dev/mem is often a good bet, though there are plenty of choices), compress it, and then encrypt it with a good algorithm. Samples from the resulting file are, for all intents and purposes, random.

      In fact, this is a good source of random data. However, precisely because it is a good source, it could easily be bad for Monte Carlo searching. As I've been suggesting in other posts, Monte Carlo searching benefits from uniformly distributed numbers.

      But truly random numbers are not statistically guaranteed to be uniformly distributed (no, the law of averages does not work that way :) ), and so they cause Monte Carlo searching to converge more slowly (though they do not keep it from converging).

      This is why pseudo-random numbers can, theoretically, be better than truly random numbers: they are often crafted to be uniformly distributed -- statistically. However, in practice pseudo-random numbers are frequently not perfectly statistically uniformly distributed (what a mouthful), and so can easily lead to a mis-convergence.

      Enter quasi-random numbers. These are uniformly distributed and have a bias towards non-repetition. This means that you still get a guaranteed convergence, and you get it faster (since there are no clumps in your set).

      Ok, time to go.


        Whether what you say about Monte Carlo holds depends upon what you are trying to do. In this case, absolutely. If you are using a Monte Carlo algorithm to calculate a probability, you are certainly better off using statistically uniformly distributed numbers than really random numbers, for exactly the reason you say (speed of convergence). Of course, for the same problem you are even better off turning it into an integration problem and then attempting standard numerical integration techniques. (Of course we have better means of calculating Pi, but I digress.)

        However if you are going to do a large number of complex scenarios which involve multiple random decisions, and particularly if you will then compute summary statistics on those runs, then speed of convergence or no, it is probably safer to use random data for your random decisions.

        On a related note, I remember having seen some research showing that chaotic systems can be surprisingly good at detecting pseudorandom input. So again if you are doing a Monte Carlo simulation of how a chaotic system will react, you are not guaranteed of accurate results from using pseudorandom numbers.

        So to summarize, for simple problems you are right that the right pseudorandom sequence tends to converge more rapidly. But using good random data can prevent a variety of causes of spurious results.

Re: Srand versus rand
by Fingo (Monk) on Feb 17, 2001 at 01:03 UTC
    I did a search on CPAN for a quasi-random number generation module, but sadly could not find one. I will try the systematic dot approach that you mentioned, but this has interested me very much. Can you please tell me how I could go about writing a quasi-random number generator? Building a module will be good practice anyway, even if I can go without it ;)
      Welp, I can't remember off the top of my head how to do anything but Halton sequences. (And I'm fuzzy on one point that I'll mention below.)

      Halton sequences aren't very random-looking as quasi-random numbers go, but they are better than nothing :) . You choose a base b and a number n. You then write n in base b, reverse its digits, and put a radix point in front. Convert the number back into whatever base you want to view it in, and you are done.

      I gave an example with b = 2 and n going from 0 to 7 before (here). And here is where I'm fuzzy. I believe the best way to use Halton sequences is not to keep b constant while varying n.

      Rather, you should vary both n and b. I believe you increment n sequentially, but set b to the nth prime number. This seems right, but it could also be that you make b the next prime larger than n.

      If you notice, keeping b constant fills in your area in a regular grid-like pattern, with b dictating the initial fineness (though any b gives you an infinite sequence). This is why I'm fairly certain you should use one of the two techniques above to vary b; I just can't remember which one at the moment :) .
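      For what it's worth, the usual multi-dimensional formulation I've seen keeps one fixed prime base per coordinate (2 for the first coordinate, 3 for the second, 5 for the third, and so on) and increments n once per point. A sketch under that assumption, for 2-D points:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# nth element of the base-b Halton sequence: mirror n's base-b digits
# across the radix point.
sub halton {
    my ($n, $b) = @_;
    my ($a, $x) = (0, $b);
    while ($n) {
        $a += ($n % $b) / $x;
        $x *= $b;
        $n = int($n / $b);
    }
    return $a;
}

# One prime base per coordinate: base 2 for x, base 3 for y.
for my $n (1 .. 8) {
    printf "(%.4f, %.4f)\n", halton($n, 2), halton($n, 3);
}
```

Using distinct prime bases keeps the coordinates from correlating with each other, which is why the points spread evenly over the square instead of lining up.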

      Here is a snippet (which hopefully works) that should generate the nth number of a base-b Halton sequence. There will probably be faster, more robust, and/or shorter approaches (perl golf is encouraged :) ):

      sub halton {
          my ($n, $b) = @_;
          my ($a, $x) = (0, $b);
          while ($n) {
              $a += ($n % $b) / $x;   # next digit of n, in its mirrored place
              $x *= $b;
              $n  = int($n / $b);
          }
          return $a;
      }

      Back to work. Good luck.


      p.s. Note that it would be more accurate to calculate $a by reversing this loop (that is, ($n % $b)/$x gets smaller as the loop progresses, and it would be better to add the smaller values of this term first, then proceed towards the larger ones). This is purely a floating-point rounding issue.
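      As a sketch of that reversal (my own rework; the sub name halton_stable is made up), you can collect the digit terms first and then sum them smallest-first:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Same Halton computation, but the per-digit terms are buffered and
# summed from smallest to largest to reduce floating-point rounding loss.
sub halton_stable {
    my ($n, $b) = @_;
    my (@terms, $x);
    $x = $b;
    while ($n) {
        push @terms, ($n % $b) / $x;   # terms are pushed largest-first
        $x *= $b;
        $n = int($n / $b);
    }
    my $a = 0;
    $a += $_ for reverse @terms;       # so reverse sums smallest-first
    return $a;
}

printf "%.6f\n", halton_stable(5, 2);   # 0.625000
```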