http://www.perlmonks.org?node_id=192689

Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

I need to generate random (and unique) 6 digit numbers. I understand from doing some searching that Perl's rand function should not be relied apon as it only generates pseudo random numbers. The version of Solaris I'm using does not have a /dev/random unfortunately. So, would I be correct in assuming that I could use Math::TruelyRandom to generate a unique 6 digit number?

If I go with Perl's rand I was thinking of storing the generated numbers in a tied hash so that I could perform a lookup to see if I've ever generated that number before (just to be on the safe side). Does that make sense? What would you do?

Replies are listed 'Best First'.
Re: Generating random 6 digit numbers
by BrowserUk (Patriarch) on Aug 25, 2002 at 17:54 UTC

    #!perl -w @nums = 100000 .. 999999; # Generate all the possible numbers ## Shuffle them (use a better rand if needed) $_ ne ($a=int( rand( $#nums-$_+1))+$_) and $nums[$_] ^= $nums[$a] ^= $ +nums[$_] ^= $nums[$a] for (0..$#nums); print pop@nums; # pop a new one each time you need it

    The array takes about 50 seconds to initialise and shuffle (on my quite whimpy box) but much less storage than using a hash of the same size.

    There are less obfu versions of the Fisher-Yates shuffle around too, but this one was handy.

    Update: Having read your later posts regarding using the numbers during different runs of the program, I realise you need to save the array somewhere.

    Not having made any use of the available persistant storage mechanisms that perl provides, I'd probably do something like:

    #!perl -w @nums = 100000 .. 999999; # Generate all the possible numbers ## Shuffle them (use a better rand if needed) $_ ne ($a=int( rand( $#nums-$_+1))+$_) and $nums[$_] ^= $nums[$a] ^= $ +nums[$_] ^= $nums[$a] for (0..$#nums); open OUT, '>randoms' or die $!; print OUT pack 'I*', @numbs; close OUT or warn $!;

    And then whenI needed one of them

    .... open RNDNUM, '<randoms' or die $!; binmode(RNDNUMS); # If needed! seek RNDNUMS, -4, 2; my $newlen = tell RNDNUM; my $rndnum = do { local $/=\4; unpack 'I',<RNDNUM>; } truncate RNDNUM, $newlen or die $!; close RNDNUM or warn $!; ..... # do whatever with $rndnum

    ... but that's cos I know how to make this work. The second bit of code is untested, but demos the basic mechanism of pre-generating your random numbers and storing them somewhere, and then throwing them away once you've used them.

    Whatever you do, don't generate a random number and then try to verify whether you've used it before. That's a recipe for disaster wasteful.

    It will seem ok at first, but once your halfway through your range, you are going to (statistically) have to generate 450,000 random numbers before you find one that you haven't used before!

    (I just KNOW a mathematician is gonna correct that last bit:)

    Updated: removed falacious Infinite Improbability Drivel* cos the mathematician did!

    With apologies to Douglas Adams RIP


    What's this about a "crooked mitre"? I'm good at woodwork!
      Whatever you do, don't generate a random number and then try to verify whether you've used it before. That's a recipe for disaster. It will seem ok at first, but once your halfway through your range, you are going to (statistically) have to generate 450,000 random numbers before you find one that you haven't used before! (I just KNOW a mathematician is gonna correct that last bit:)

      If you have an urn with 50 blue marbles (representing already seen), and 50 red marbles (not yet seen), do you really think that the average number of draws (with replacement) that you need to make in order to choose a red marble is anywhere near 50? (remember, at this point the probability of drawing a red marble on a given draw is still a relatively high 0.5)
      :-)

        I knew it:) I just KNEW!

        Ok. Worst case scenario, you pick 51 balls before getting a red one. Best case, 1. so that makes the average number 26?

        In the original example, ~225,000 before getting an unused one?

        Update:

        Bah! Having submited this, I suddenly remember drawing dozens and dozens of bell-shaped graphs. I've a sneaky suspicion they've something to do with this!

        I'm gonna stick to nice comfortable metrics like "lots" in future:^p


        What's this about a "crooked mitre"? I'm good at woodwork!
      @nums = 100000 .. 999999; # Generate all the possible numbers

      I guess it depends on what is meant by "6 digit number". If it means "a number more than the greatest 5 digit number (99,999) but less than the least 7 digit number (1,000,000)", which is probably the case, then this works.

      The other possible definition is "A combination of numbers 0 through 9 taken 6 times." In that case, you could do this:
      @numbs = '000000' .. '999999';

        Hmmm! By that definition, we could all say we earned a 6-digit salary, even if it was $000,000.00.


        What's this about a "crooked mitre"? I'm good at woodwork!
(jeffa) Re: Generating random 6 digit numbers
by jeffa (Bishop) on Aug 25, 2002 at 17:48 UTC
    use strict; my %numb; my @numb = (0..9); for (0..9999) { my $numb; $numb .= $numb[rand @numb] for 0..5; $numb{$numb}++; } my @unique; for (keys %numb) { push @unique,$_ if $numb{$_} == 1; } print scalar @unique, " unique values\n";

    jeffa

    L-LL-L--L-LL-L--L-LL-L--
    -R--R-RR-R--R-RR-R--R-RR
    B--B--B--B--B--B--B--B--
    H---H---H---H---H---H---
    (the triplet paradiddle with high-hat)
    
      Is there a reason your variables are all numb? ;-)

      Makeshifts last the longest.

Re: Generating random 6 digit numbers
by sauoq (Abbot) on Aug 25, 2002 at 17:39 UTC
    Why use a tied hash when a normal old hash will do just fine? Store your numbers as the keys. Math::TrulyRandom generates numbers from interrupt timing discrepancies. It's pretty slow. Generally you should just use it to seed your pseudo random number generator. You might want to have a look at EGD - the Entropy Gathering Daemon.
    -sauoq
    "My two cents aren't worth a dime.";
    
      Thanks for your reply. The tied hash would be used because this program would run as a cron job and therefore would not know which numbers the previous run had generated.
Re: Generating random 6 digit numbers
by Arien (Pilgrim) on Aug 25, 2002 at 17:29 UTC

    What is more important: that your numbers are random or that they are unique? Your requirements are contradictory.

    — Arien

      They must be random (i.e not a sequence number) but I cannot use the same one twice. Does that make sense?
        In this case, they necessarily become less random with each one that you generate. In other words, once you have generated 999,999 of them, you know what the next one must be. You might be better off just shuffling the numbers 0 to 999,999. The reason I say so is that the more numbers you have already generated the harder it is going to be to find one that you haven't yet generated and the longer it will take. It simply isn't very efficient to generate a number, check to see if you have generated it before, and generate a new number if you have.
        -sauoq
        "My two cents aren't worth a dime.";
        

        Sure, that makes sense. In that case uniqueness seems to be of greater importance than true randomness: you just don't want the numbers to be easily guessable, correct?

        One solution is to shuffle the numbers in this range and pop or shift one every time, as sauoq mentions above.

        If this solution is desirable or not depends on what you are trying to achieve. How many numbers of those in your range will you issue? How hard do you want to make "guessing" a correct number? What are the consequences of "guessing" a valid number? These are some things to keep in mind.

        — Arien

        From the best that I can tell, the technical term for what you are trying to generate is called a "nonce"- a randomized number that is never used more than once, classically used to prevent replay in authentication protocols. If security is a real concern, your best bet will be to research standard authentication protocols and implement a proven scheme that meets your needs rather than trying to role one yourself. If you are just trying to stop someone from guessing an account number after three or four tries, you could probably get away with just using rand or Math::TruelyRandom and keep track what values have already been assigned in a hash or in some other way.

        Also, if you want, post your expectation for the total number of nonces generated, and the expected rate that they will assigned and we can do some back-of-the-envelope calculations as to if a 6 digit number will be sufficient.

        Good luck,
        cmumikey
Re: Generating random 6 digit numbers
by abell (Chaplain) on Aug 26, 2002 at 13:32 UTC
    One of the simplest pseudo-random generators uses consecutive powers of an integer modulo a prime p. If the basis integer is well chosen, then you get all numbers modulo p (apart from 0) before a repetition. For instance, taking p=7 and 3 as basis, one gets
    3^1 = 3   = 3 (mod 7)
    3^2 = 9   = 2 (mod 7)
    3^3 = 27  = 6 (mod 7)
    3^4 = 81  = 4 (mod 7)
    3^5 = 243 = 5 (mod 7)
    3^6 = 729 = 1 (mod 7)
    
    The outcome is a shuffle of the numbers (1..6). In your case, you could choose 999983 as p, wasting 17 possible numbers, or 1000003, which forces you to repeat the generation when the outcome is greater than 999999. In both cases, a number only depends on the previous one, which makes generation very easy and storage minimum. If you depend on an attacker not being able to guess the sequence, this method is far too weak.
    For your convenience, you could use
    530225 for integers modulo 999983
    and
    318611 for integers modulo 1000003

    Best regards

    Antonio Bellezza
Re: Generating random 6 digit numbers
by waswas-fng (Curate) on Aug 25, 2002 at 20:39 UTC
    What are you going to use them for? 999,999 combinations is really not ver expansive, there may be better ways to do what you are trying to do if we had more information about what it is you are trying to do. for instance if you are looking for unique way to generate urls to give a one time download or something like that 999,999 combinations would be a simple task to brute force.. toss us a lil more info

    -Waswas
      I'm going to be processing text files of new customers. Each customer in the file needs to have a random (yet unique) id generated. The requirements state that the id needs to be 6 digits. No word on how many customers I'll get, certainly no more than 999,999 (so I'm told anyway).
        Ok, I have to ask. What kind of application are you using where using pseudo random numbers for customer ids isn't good enough? I'm already amazed that sequential numbers can't be used, but not even PRNs?

        Well, even if you have 900,001 customers, you have a problem, unless ids are allowed to have leading 0s. Of course, even if you have 100,000 customers, using 6 digits for the id in combination with being truely random and unique doesn't make much sense.

        Abigail

        Why does the ID need to be random? Simply counting the customers would make them all unique.

        If it must be random I'd count the customers and change the leading zeros to a generated pseudo random number.

        Hope that helps
        Chris

        Lobster Aliens Are attacking the world!
        I've done something like that.

        Instead of using digits, I used 6 (or howevermany) random characters. I used only letters and numbers that didn't look like other characters, so it could be safe from human transcription.

        But, after generating the random file name, it's trivial to test if it's a duplicate because the directory itself is a record! You don't have to maintain your own hash. Just see if the candidate file name already exists.

        —John

        I'm a little worried that the randomness might be for security reasons. If your objective is to make a valid customer ID number hard to guess, you need to say so now and let the Monks help prevent you from making a terrible mistake.
Re: Generating random 6 digit numbers
by Vennis (Pilgrim) on Aug 26, 2002 at 12:01 UTC
    Another way, using the perl rand function,

    @nums = '0' .. '30'; # change to '000000' .. '999999'; srand; while(@nums){ push(@randnums, splice @nums,int(rand @nums),1); }

    If you want to be able to interrupt the process, you could choose to write @nums to a file, reload it and continue.

    # Test print: print join(' ',@randnums); print "\n"; print join(' ',sort {$a <=> $b} @randnums);
Re: Generating random 6 digit numbers
by John M. Dlugosz (Monsignor) on Aug 26, 2002 at 15:41 UTC
    Use a cipher as an "electronic code book". For example, fire up RC4 (which is easy to write in Perl, and allows the output to be the same length as the input for any size). Clone the RC4 object's state, since it doesn't have an ECB mode normally.

    Now, encode your counter, a 3-byte integer.

    Reset your RC4 object's state back to the beginning, then increment your counter and repeat. Each input gives a different output, and you can maintain a simple high-water mark.

    A bijective hash function will do the same thing, but it would be harder to prevent the underlying pseudorandom sequence from being discovered by watching the output for a while.

    —John

Re: Generating random 6 digit numbers
by grantm (Parson) on Aug 26, 2002 at 21:03 UTC

    Here's a Perl 1 liner to get the numbers 1 to 100 in a random sequence:

    perl -MLWP::Simple -e "getprint('http://www.random.org/cgi-bin/randseq +?min=1&max=100')";

    The nice folks at random.org won't give you 100000-999999 for free but it's a start.

Re: Generating random 6 digit numbers
by richardX (Pilgrim) on Aug 26, 2002 at 23:24 UTC
    Take your pick from the available algorithms and seed them with any pseudo random number generator (PRNG) like rand. Perl modules are available for the following algorithms:

    Blowfish, DES, TripleDES, IDEA.

    Richard

    There are three types of people in this world, those that can count and those that cannot. Anon

Re: Generating random 6 digit numbers
by plauterb (Acolyte) on Aug 27, 2002 at 13:23 UTC
    Anyone who considers arithmetical methods of producing random digits is, of course, in a state of sin.
    John Von Neumann (1951)
      There is no such thing as random.
      - Vennis (2002)

Re: Generating random 6 digit numbers
by pingo (Hermit) on Aug 27, 2002 at 12:10 UTC
    Just out of curiousity (well, mostly), is there a module that does something like Apache's mod_unique_id?
Re: Generating random 6 digit numbers
by Anonymous Monk on Aug 26, 2002 at 22:04 UTC
    You could do it enough times and concenate them.
Re: Generating random 6 digit numbers
by jotti (Scribe) on Aug 28, 2002 at 08:03 UTC
    Divide the problem into two:

    • Create a true random generator or get one. There are better and worse pseudo-random generators. Most algorithmic ones are pseudo. One attempt to a true one would be to place the computer's microphone near the fan or any other noisy equipment and read the least significant bit from the mic port and put together a n-bit number.
    • Create the 6 digit numbers.
      • If you want all 6 digits to be unique, put all digits from 0 to 9 in an array, shuffle it using your random generator and pick the 6 first digits. If first digit can't be zero and the first item in the shuffled array is zero, just pick the 6 next digits.
      • If you want each new 6-digit-number to be unique, create an array with the numbers 100,000 to 999,999 and shuffle it. Then pick them in order.