Re: random index in array - no duplicates
by jsprat (Curate) on Jun 07, 2002 at 00:48 UTC
|
You're looking for a shuffle. Once the array is shuffled, you can pull them sequentially. If you don't care if the elements get out of order, use an in-place shuffle. If the order of the elements is important, create a second array of indexes and shuffle that.
my $array = shift;
my $i;
for ($i = @$array; --$i; ) {
my $j = int rand ($i+1);
@$array[$i,$j] = @$array[$j,$i];
}
This is the Fisher-Yates algorithm, shuffles the array in place - very efficient, and lifted straight from perldoc -q shuffle | [reply] [d/l] [select] |
|
use List::Util 'shuffle';
@array = shuffle @array;
-sam
| [reply] [d/l] |
|
One problem with this, however, is that if the array contains multiple elements with the same value, they will still show up multiple times. Best to filter them out first.
| [reply] |
|
| [reply] |
Re: random index in array - no duplicates
by crazyinsomniac (Prior) on Jun 07, 2002 at 00:38 UTC
|
An option is to use splice. It's probably not the most efficient (you can always benchmark to make sure), but otherwise you can't get away with a single pass (at least not using rand, there probably is a perl module to only spit out unique random values, but its destined to be less efficient than just keeping track yourself).
| [reply] |
|
A splice isn't efficient. It's essentially
quadratic in the size of the array, while a shuffle is
linear.
Or to be more precise. If you have an array of size N
and you want to draw M elements from it (without
duplicates), splice will take time proportional
to N * M, while the suffle will take time proportional
to N + M (or just M if you get a reference to
the array).
Abigail
| [reply] [d/l] [select] |
Re: random index in array - no duplicates
by jryan (Vicar) on Jun 07, 2002 at 05:23 UTC
|
As a bit of a golf Challenge:
$r=(grep!$$_++,@a)[rand@a] #evil
That probably doesn't help you so much though, does it? Lets work through the first one, and re-write it a bit clearer as we go along.
# create a counter to store the number of times the variable has
# been encountered.
my %counts;
while (1)
{
my %h = %counts;
# Allow only the elements that haven't been seen yet. Check the
# perlfaq for more details
my @unique = grep {!$h{$_}++} @array;
# if @unique has no elements, there aren't any non-duplicates left
last unless (@unique);
# grab a random index
my $random_index = int(rand(@unique));
# get a random element
my $random_element = $unique[$random_index];
print "Random non duplicate element is: ", $random_element, "\n";
# add the element to the counter so that it doesn't get set
$counts{$random_element}++;
}
Thats a bit less efficient than the first example, but much more readable. If you have any questions about it, just respond.
| [reply] [d/l] [select] |
|
Thanks for the suggestion. I tried incorporating your code exactly as it is into my program, but I was still getting duplicate values. It was probably a mistake on my part. Either way, I followed your logic and was able to implement my own version that is working great! Thanks!
| [reply] |
Re: random index in array - no duplicates
by boo_radley (Parson) on Jun 07, 2002 at 00:54 UTC
|
my @foo = (1..100);
my @bar= sort {rand(1)<=>rand(1)} @foo;
print "how many elements?";
my $num = <STDIN>; chomp $num;
print join" ", @bar [(0..--$num)];
| [reply] [d/l] |
Re: random index in array - no duplicates
by oakbox (Chaplain) on Jun 07, 2002 at 10:04 UTC
|
This is overly verbose, but that's okay. Basically I'm making a copy of the array, iterating over the original array $#array times, and removing the randomly selected element from the copy on each iteration.
That means that each element is used only once and that the resulting array is basically a randomized version of the input array.
#!/usr/bin/perl
@arr=("1","2","3","4");
@sub=@arr;
foreach (@arr){
$slot = int(rand(@sub));
($new) = splice(@sub,$slot,1);
push(@newarr,$new);
}
foreach (@newarr){
print "$_\n";
}
And then scrunch up that code a little:
$hold=@arr;
foreach (0...$hold){
push(@newarr, splice(@arr,int(rand(@arr)),1));
}
Still seems inelegant though . . . hmmmm.
-oakbox | [reply] [d/l] [select] |
Re: random index in array - no duplicates
by shambright (Beadle) on Jun 07, 2002 at 19:42 UTC
|
How about something like...
my %seen=();
$array_size = @array;
$index = "";
until ($index ne "") {
my $dbl_chk = keys %seen;
if ($dbl_chk == $array_size) { #Every array element has been seen
+once.
%seen = (); #start over!
}
my $new = int(rand($array_size));
if (!(exists($seen{$new}))) {
$seen{$new} = 1;
$index = $new;
}
}
If you need to know which ones have been seen already:
@seen = keys %seen; #array index of "seen" values.
or what if the random number generator continues to produce "seen" index values
--you're either very unlucky, or you need a better random number generator.
update in the future, it would probably serve me better to read the entire post. The "shuffle" is a better answer than mine. Although, on large arrays it might be faster just to read elements until you find an unread one than do a random sort. I'm not sure.
| [reply] [d/l] [select] |
Re: random index in array - no duplicates
by joecamel (Hermit) on Jun 08, 2002 at 01:25 UTC
|
A quick way to get a list of unique values is to store them as keys in a hash.
my @ary = qw(a a b c d e f f f g h i i j k l l);
my %unique = map { $_, 1 } @ary;
my @randUnique = sort { rand(1) <=> rand(1) } keys %unique;
joecamel | [reply] [d/l] |
Re: random index in array - no duplicates
by Anonymous Monk on Jul 07, 2016 at 17:58 UTC
|
another thing (not optimized but i hope clear enough)
not exactly for the same use (i needed to keep the values of a column) so it is more an array to anonymise some labels. i use it for blind tests
my @tab_anonyme;
for $i (0..$#my_data) {
$tab_anonyme[$i][0] = $my_data[$i][0]; # data to keep
$tab_anonyme[$i][1] = int ( rand (100) ) + 1; # anonyme label
}
# verification
my $flag_duplicate;
do {
$flag_duplicate = "false";
for $i (0..$#tab_anonyme) {
for $i2 (0..$#tab_anonyme) {
if ($i2 == $i) { next; } # no testing of a value against i
+tself
if ( $tab_anonyme[$i][1] == $tab_anonyme[$i2][1] ) {
$tab_anonyme[$i2][1] = int ( rand (100) ) + 1;
$flag_duplicate = "true";
}
}
}
} while ( $flag_doublon eq "true" );
two problems
- in rand(X), if X is smaller than the number of lines of the array @mydata => infinite loop, make it four or five time bigger
- time taken is not very predictible | [reply] [d/l] |
|
The answer is still shuffle. Your problem can be broken down in two parts: (1) constructing a list of random labels; (2) assembling the required data structure. Shuffle solves the first part, tutorials (perldsc, perllol) for working with structured data may help you with the latter.
use List::Util qw( shuffle );
my @data = map { [$_, undef] } qw( v1 v2 v3 v4 v5 );
my @shuf = shuffle 0..$#data;
my @tab = map { [$data[$_][0], $shuf[$_]] } 0..$#data;
use Data::Dumper;
print Data::Dumper->Dump([\@data, \@tab], [qw(*data *tab)]);
Please, if you need help or advice with your programming tasks, post them as new questions in Seekers of Perl Wisdom. | [reply] [d/l] [select] |