Beefy Boxes and Bandwidth Generously Provided by pair Networks
more useful options
 
PerlMonks  

Is there a better way to generate unique set of random numbers ?

by chakreey (Acolyte)
on Jul 28, 2011 at 09:35 UTC ( #917225=perlquestion: print w/ replies, xml ) Need Help??
chakreey has asked for the wisdom of the Perl Monks concerning the following question:

Hello Monks,

I am trying to generate set of unique random numbers. Here is my script for doing this

use strict; use warnings; my (@RandomSet,$random_number, $element_of_array, $count); foreach(1..10){ $random_number = int(rand(1185)); $count=0; foreach $element_of_array (@RandomSet){ if ($random_number == $element_of_array){ $count++; } } push (@RandomSet,$random_number) if $count==0; } foreach $element_of_array (@RandomSet){ print "$element_of_array\n"; }

for every new random number generated, I am finding out if it is unique or a repeat. Is there a better/faster way to achieve this ?

TIA

Comment on Is there a better way to generate unique set of random numbers ?
Download Code
Re: Is there a better way to generate unique set of random numbers ?
by moritz (Cardinal) on Jul 28, 2011 at 09:44 UTC
    I'd do it like this:
    use strict; use warnings; my @random_set; my %seen; for (1..10) { my $candidate = int rand(1185); redo if $seen{$candidate}++; push @random_set, $candidate; } print join(', ', @random_set), "\n";

    If you need to generate many more than 10 distinct random numbers, this solution will be faster because the hash lookup works in constant time, whereas your solution has to iterate over the whole array for each number. For 10 there won't be a big difference.

    Also note that for testing it is better to reduce the maximum number so that collision become more likely - I tested it with 12 instead of 1185.

    Update: If the maximal number is the same or not much larger than the number of random values you want, this would probably be faster:

    use List::Util qw/shuffle/; my @random_set = (shuffle 0..11)[0..9];

    That is, first generating a random permutation, and then selecting some of the elements.

    But as always it's best to Benchmark the different solutions with real-world data.

      Thank you

      I am still a beginner in perl, can you help me understand $seen{$candidate}++ in line:

      redo if $seen{$candidate}++;
        %seen is a hash. If the key $candidate is not present in the hash, it is added to the hash (that's called "autovivification"), initialized to 1, and undef is returned - the redo is not executed.

        On the other hand if the key is already present in the hash, the corresponding value (for example 1) is returned, and incremented by one (for example set to 2). In this case the interesting part is not the incrementing, but that it returns a true value, so the redo is executed.

      If you need to generate many more than 10 distinct random numbers, this solution will be faster because the hash lookup works in constant time, whereas your solution has to iterate over the whole array for each number. For 10 there won't be a big difference.
      But do note that the only reason your algorithm guarantees to terminate is because rand uses a pseudo random number generator. Were rand to be truly random, neither algorithm could guarantee to be finished before Perl 6 dominates the world.

      Here's an algorithm that does terminate:

      use List::Util 'shuffle'; my @set = (shuffle(0..1184))[0..9];
        Were rand to be truly random, neither algorithm could guarantee to be finished before Perl 6 dominates the world.

        Though if you do the math and calculate the probability that either algorithm does not terminate with the given parameters in, let's say, 2 seconds on a modern machine, you'll probably find that the chance of computation errors in the CPU caused by cosmic rays is much higher.

        Doing the strict theory only makes sense if the actual machine corresponds to the machine model that the theory assumes.

Re: Is there a better way to generate unique set of random numbers ?
by ambrus (Abbot) on Jul 28, 2011 at 10:14 UTC
Re: Is there a better way to generate unique set of random numbers ?
by Anonymous Monk on Jul 28, 2011 at 12:55 UTC
    A PRNG will basically never "exactly repeat," and a sufficiently long string made from one won't either. You could trap key-violations on SQL inserts to catch any cosmic rays. "The way" to detect dupes in a large dataset is to disk-sort it.
Re: Is there a better way to generate unique set of random numbers ?
by pid (Monk) on Jul 29, 2011 at 07:46 UTC

    In case you want a oneliner to solve this, here I paste one that I came up with :)

    $ perl -le '@a{map{int rand 1185}(1..10)}=();print for keys %a' 373 304 286 210 816 777 596 18 766 880

Re: Is there a better way to generate unique set of random numbers ?
by FloydATC (Hermit) on Jul 29, 2011 at 21:50 UTC
    I have another suggestion that would seem slightly more linear than stabbing away at a %seen hash:

    my @numbers = (1 .. 10); while (@numbers) { my $index = rand(@numbers); my $pick = splice(@numbers, $index, 1); print "$pick\n"; }
    Edit: The first version had an off-by-one error :o)
    -- Time flies when you don't know what you're doing

Log In?
Username:
Password:

What's my password?
Create A New User
Node Status?
node history
Node Type: perlquestion [id://917225]
Approved by marto
Front-paged by Arunbear
help
Chatterbox?
and the web crawler heard nothing...

How do I use this? | Other CB clients
Other Users?
Others imbibing at the Monastery: (8)
As of 2014-09-01 11:53 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    My favorite cookbook is:










    Results (6 votes), past polls