Beefy Boxes and Bandwidth Generously Provided by pair Networks
"be consistent"
 
PerlMonks  

random index in array - no duplicates

by emilford (Friar)
on Jun 07, 2002 at 00:31 UTC ( [id://172400]=perlquestion: print w/replies, xml ) Need Help??

emilford has asked for the wisdom of the Perl Monks concerning the following question:

Does anyone have an easy solution for pulling a random element from an array, while at the same time making certain this random element isn't a duplicate? If I had a list of 10-15 elements in an array, how could I be sure to not pull a duplicate random element?

I've used this code to grab an index within range,
$array_size = @array; $index = int(rand($array_size));
but what if I want to be sure that I don't get a duplicate index? I could push each "seen" index into another array and then compare the random index to all already seen index values, looping until a fresh index is found. The only problem I see is that it may take several iterations before finding an available index...or what if the random number generator continues to produce "seen" index values?

Does anyone have an efficient method for achieving this? Thanks in advance.

Replies are listed 'Best First'.
Re: random index in array - no duplicates
by jsprat (Curate) on Jun 07, 2002 at 00:48 UTC
    You're looking for a shuffle. Once the array is shuffled, you can pull them sequentially. If you don't care if the elements get out of order, use an in-place shuffle. If the order of the elements is important, create a second array of indexes and shuffle that.

    my $array = shift; my $i; for ($i = @$array; --$i; ) { my $j = int rand ($i+1); @$array[$i,$j] = @$array[$j,$i]; }

    This is the Fisher-Yates algorithm, shuffles the array in place - very efficient, and lifted straight from perldoc -q shuffle

      Or simpler still:

      use List::Util 'shuffle'; @array = shuffle @array;

      -sam

      One problem with this, however, is that if the array contains multiple elements with the same value, they will still show up multiple times. Best to filter them out first.
        The OP wrote "but what if I want to be sure that I don't get a duplicate index?"

        I'm pretty sure he was worrying about pulling the same element twice from the array, not worrying about duplicate elements. If there are duplicate elements, yes he would want to weed them out first.

Re: random index in array - no duplicates
by crazyinsomniac (Prior) on Jun 07, 2002 at 00:38 UTC
      A splice isn't efficient. It's essentially quadratic in the size of the array, while a shuffle is linear.

      Or to be more precise. If you have an array of size N and you want to draw M elements from it (without duplicates), splice will take time proportional to N * M, while the suffle will take time proportional to N + M (or just M if you get a reference to the array).

      Abigail

Re: random index in array - no duplicates
by jryan (Vicar) on Jun 07, 2002 at 05:23 UTC
    As a bit of a golf Challenge:
    $r=(grep!$$_++,@a)[rand@a] #evil

    That probably doesn't help you so much though, does it? Lets work through the first one, and re-write it a bit clearer as we go along.

    # create a counter to store the number of times the variable has # been encountered. my %counts; while (1) { my %h = %counts; # Allow only the elements that haven't been seen yet. Check the # perlfaq for more details my @unique = grep {!$h{$_}++} @array; # if @unique has no elements, there aren't any non-duplicates left last unless (@unique); # grab a random index my $random_index = int(rand(@unique)); # get a random element my $random_element = $unique[$random_index]; print "Random non duplicate element is: ", $random_element, "\n"; # add the element to the counter so that it doesn't get set $counts{$random_element}++; }

    Thats a bit less efficient than the first example, but much more readable. If you have any questions about it, just respond.

      Thanks for the suggestion. I tried incorporating your code exactly as it is into my program, but I was still getting duplicate values. It was probably a mistake on my part. Either way, I followed your logic and was able to implement my own version that is working great! Thanks!
Re: random index in array - no duplicates
by boo_radley (Parson) on Jun 07, 2002 at 00:54 UTC

    do you mean something like :

    my @foo = (1..100); my @bar= sort {rand(1)<=>rand(1)} @foo; print "how many elements?"; my $num = <STDIN>; chomp $num; print join" ", @bar [(0..--$num)];
Re: random index in array - no duplicates
by oakbox (Chaplain) on Jun 07, 2002 at 10:04 UTC
    This is overly verbose, but that's okay. Basically I'm making a copy of the array, iterating over the original array $#array times, and removing the randomly selected element from the copy on each iteration.

    That means that each element is used only once and that the resulting array is basically a randomized version of the input array.

    #!/usr/bin/perl @arr=("1","2","3","4"); @sub=@arr; foreach (@arr){ $slot = int(rand(@sub)); ($new) = splice(@sub,$slot,1); push(@newarr,$new); } foreach (@newarr){ print "$_\n"; }
    And then scrunch up that code a little:
    $hold=@arr; foreach (0...$hold){ push(@newarr, splice(@arr,int(rand(@arr)),1)); }
    Still seems inelegant though . . . hmmmm.
    -oakbox
Re: random index in array - no duplicates
by shambright (Beadle) on Jun 07, 2002 at 19:42 UTC
    How about something like...
    my %seen=(); $array_size = @array; $index = ""; until ($index ne "") { my $dbl_chk = keys %seen; if ($dbl_chk == $array_size) { #Every array element has been seen +once. %seen = (); #start over! } my $new = int(rand($array_size)); if (!(exists($seen{$new}))) { $seen{$new} = 1; $index = $new; } }
    If you need to know which ones have been seen already:
    @seen = keys %seen; #array index of "seen" values.

    or what if the random number generator continues to produce "seen" index values
    --you're either very unlucky, or you need a better random number generator.

    update in the future, it would probably serve me better to read the entire post. The "shuffle" is a better answer than mine. Although, on large arrays it might be faster just to read elements until you find an unread one than do a random sort. I'm not sure.

Re: random index in array - no duplicates
by joecamel (Hermit) on Jun 08, 2002 at 01:25 UTC
    A quick way to get a list of unique values is to store them as keys in a hash.
    my @ary = qw(a a b c d e f f f g h i i j k l l); my %unique = map { $_, 1 } @ary; my @randUnique = sort { rand(1) <=> rand(1) } keys %unique;
    joecamel
Re: random index in array - no duplicates
by Anonymous Monk on Jul 07, 2016 at 17:58 UTC
    another thing (not optimized but i hope clear enough) not exactly for the same use (i needed to keep the values of a column) so it is more an array to anonymise some labels. i use it for blind tests
    my @tab_anonyme; for $i (0..$#my_data) { $tab_anonyme[$i][0] = $my_data[$i][0]; # data to keep $tab_anonyme[$i][1] = int ( rand (100) ) + 1; # anonyme label } # verification my $flag_duplicate; do { $flag_duplicate = "false"; for $i (0..$#tab_anonyme) { for $i2 (0..$#tab_anonyme) { if ($i2 == $i) { next; } # no testing of a value against i +tself if ( $tab_anonyme[$i][1] == $tab_anonyme[$i2][1] ) { $tab_anonyme[$i2][1] = int ( rand (100) ) + 1; $flag_duplicate = "true"; } } } } while ( $flag_doublon eq "true" );
    two problems

    - in rand(X), if X is smaller than the number of lines of the array @mydata => infinite loop, make it four or five time bigger

    - time taken is not very predictible

      The answer is still shuffle. Your problem can be broken down in two parts: (1) constructing a list of random labels; (2) assembling the required data structure. Shuffle solves the first part, tutorials (perldsc, perllol) for working with structured data may help you with the latter.

      use List::Util qw( shuffle ); my @data = map { [$_, undef] } qw( v1 v2 v3 v4 v5 ); my @shuf = shuffle 0..$#data; my @tab = map { [$data[$_][0], $shuf[$_]] } 0..$#data; use Data::Dumper; print Data::Dumper->Dump([\@data, \@tab], [qw(*data *tab)]);

      Please, if you need help or advice with your programming tasks, post them as new questions in Seekers of Perl Wisdom.

Log In?
Username:
Password:

What's my password?
Create A New User
Domain Nodelet?
Node Status?
node history
Node Type: perlquestion [id://172400]
Approved by samtregar
Front-paged by jeffa
help
Chatterbox?
and the web crawler heard nothing...

How do I use this?Last hourOther CB clients
Other Users?
Others making s'mores by the fire in the courtyard of the Monastery: (4)
As of 2024-04-20 06:10 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    No recent polls found