Beefy Boxes and Bandwidth Generously Provided by pair Networks
Perl-Sensitive Sunglasses
 
PerlMonks  

random elements from array - new twist

by felwick (Sexton)
on Nov 05, 2002 at 05:49 UTC ( #210389=perlquestion: print w/ replies, xml ) Need Help??
felwick has asked for the wisdom of the Perl Monks concerning the following question:

I've spent most of the day reading the posts here about choosing random elements from an array... my problem is that i need to go a step further and i just can't get it! here's the scenario i'm working with: i have an array with 15 elements, i need to randomly choose 5 of them (without duplicates) and then assign them to individual variables as:

$comp1 = "randomElement1";
$comp2 = "randomElement2"; and so on....

anyone? i would really appreciate a solution to this! thanks - felwick

Comment on random elements from array - new twist
Re: random elements from array - new twist
by BrowserUk (Pope) on Nov 05, 2002 at 06:02 UTC

    One way would be to shuffle the elements of the array and then use a slice to assign them to your vars.

    sub shuffle (\@) { my $r=pop; $a = $_ + rand @{$r} - $_ and @$r[$_, $a] = @$r[$a, $_] for (0..$#{$r}); } my @array = (1..15); shuffle(@array); my ($comp1, $comp2, $comp3, $comp4, $comp5) = @array[0..4];

    Nah! Your thinking of Simon Templar, originally played by Roger Moore and later by Ian Ogilvy
      success! big thanks.....
      BTW shuffle() subroutine is implemented in perl module List::Util. It is probably faster than your pure Perl implementation as it is written in C.

      --
      Ilya Martynov, ilya@iponweb.net
      CTO IPonWEB (UK) Ltd
      Quality Perl Programming and Unix Support UK managed @ offshore prices - http://www.iponweb.net
      Personal website - http://martynov.org

        use List::Util qw(shuffle); @a=qw(a b g f t h y); ($a,$b,$c) = @a[(shuffle(0..$#a))[0..2]]; print " $a $b $c ";
      You have a bad shuffle. It does not give equal odds of all combinations coming up. It trivially cannot, if your array has n elements then all of your probabilities have to be multiples of 1/n**n which does not evenly divide n! - the number of possible permutations you are choosing between.

      How to do it right is a FAQ.

        FWIW. To the best of my ability to test it, and walking on the shoulders of those better versed in statistics than I, my implementation of Fisher-Yates algorithm compares favourably with that given in the FAQ.

        Perhaps you would point out the flaws in my testing method?

        Test code

        C:\test>199981 -size=100 -iter=100000 Benchmark: running FAQ_FY, shuffl, each for at least 1 CPU seconds... FAQ_FY: 2 wallclock secs ( 1.14 usr + 0.00 sys = 1.14 CPU) @ 39 +2.64/s (n=448) shuffl: 1 wallclock secs ( 1.08 usr + 0.00 sys = 1.08 CPU) @ 44 +2.70/s (n=479) Rate FAQ_FY shuffl FAQ_FY 393/s -- -11% shuffl 443/s 13% -- permutation | FAQ_FY | shuffl ---------------------------------- A B C D: | 4140 | 4221 A B D C: | 4231 | 4211 A C B D: | 4205 | 4151 A C D B: | 4247 | 4257 A D B C: | 4076 | 4275 A D C B: | 4107 | 4244 B A C D: | 4207 | 4165 B A D C: | 4159 | 4240 B C A D: | 3941 | 4187 B C D A: | 4136 | 4116 B D A C: | 4158 | 4126 B D C A: | 4185 | 4111 C A B D: | 4126 | 4231 C A D B: | 4224 | 4122 C B A D: | 4135 | 4139 C B D A: | 4123 | 4097 C D A B: | 4179 | 4001 C D B A: | 4228 | 4175 D A B C: | 4295 | 4057 D A C B: | 4264 | 4131 D B A C: | 4206 | 4169 D B C A: | 4138 | 4239 D C A B: | 4167 | 4208 D C B A: | 4123 | 4127 ------------------------------------------------------ Std. Dev. | 72.382 | 67.643 C:\test>

        Nah! You're thinking of Simon Templar, originally played (on UKTV) by Roger Moore and later by Ian Ogilvy

      Forgive me, but why are you setting $a like this:
      $a = $_ + rand @{$r} - $_ and #...

      It seems to me that it should simply be:
      $a = rand @{$r} and #...

      Just a bit confused is all... clarify?

        It got me too (see Re: Re: Re: Re: random elements from array - new twist)... It's a matter of precedence. Have a look at how perl parses it:

        perl -MO=Deparse,-p -e "$a = $_ + rand @{$r} - $_" ($a = ($_ + rand((@{$r;} - $_)))); ^^ ^^ -e syntax OK

        The '- $_' is part of the argument to rand, not the assignment to $a.

Re: random elements from array - new twist
by rob_au (Abbot) on Nov 05, 2002 at 07:01 UTC
    The solution provided by BrowserUK above is right on the money and is particularly good if the preservation of the original array set is important. If however this is not important and larger array sets are being manipulated, the following example using splice should prove to be more efficient.

    sub pick (\@) { my $array = shift; return splice @{$array}, rand @{$array}, 1; } my @array = (1..15); my @results = (); push @results, pick @array for 1..5;

    This code differs from the other solution above in that it makes use of the splice function to remove and return an element from the array, thereby ensuring that when the pick function is called again, that same value cannot be returned.

     

    perl -e 'print+unpack("N",pack("B32","00000000000000000000000111011001")),"\n"'

Re: random elements from array - new twist
by BrowserUk (Pope) on Nov 05, 2002 at 09:01 UTC

    Another variation that's useful if the elements of the array itself are large and therefore costly to shuffle or perhaps more importantly, if its necessary to keep the array in it's original order.

    Instead of shuffling the array, create and shuffle an array of indexes, then subset it to produce a slice from the real array.

    #! perl -sw use strict; sub shufl (\@) { my $r=pop; $a = $_ + rand @{$r} - $_ and @$r[$_, $a] += @$r[$a, $_] for (0..$#{$r}); } my @array = qw(the quick brown fox jumps over the lazy dog and went st +raight into a puddle); my @indexes = (0..$#array); shufl(@indexes); my ($comp1, $comp2, $comp3, $comp4, $comp5) = @array[@indexes[0..4]]; print"$comp1, $comp2, $comp3, $comp4, $comp5\n"; __END__ #Ouput C:\test>210389.pl jumps, dog, a, the, fox C:\test>210389.pl a, the, lazy, brown, into C:\test>210389.pl and, quick, a, puddle, fox C:\test>210389.pl a, and, the, the, into C:\test>210389.pl dog, jumps, went, the, straight C:\test>210389.pl lazy, quick, and, into, fox C:\test>210389.pl the, went, the, brown, over C:\test>

    Nah! Your thinking of Simon Templar, originally played by Roger Moore and later by Ian Ogilvy
Re: random elements from array - new twist
by PodMaster (Abbot) on Nov 05, 2002 at 09:55 UTC
    Oh my ... shuffle this, splice that ... how's this?
    use strict; BEGIN{eval q{use warnings;}} { my %cache; sub Gimme { my( $arr, $n, $clear, $all ) = @_; my @ret = (); $cache{$arr} = {} if defined $clear or not exists $cache{$arr} +; %cache = () if defined $all; while($n){ my $rand = $$arr[rand@$arr]; unless(exists $cache{$arr}{$rand}){ $n--; $cache{$arr}{$rand}++; push @ret, $rand; } if( scalar %{$cache{$arr}} == @$arr ) { warn "No more unique elements left in $arr"; return @ret; } } return @ret; } } my @array = ( 1..10,'a'..'z',keys %:: ); #my($c1, $c2, $c3, $c4, $c5 ) = Gimme(\@array,5); for(1..10){ print "\n\n=head1 $_\n\n"; print "\t$_\n" for Gimme(\@array, 5); } __END__

    ____________________________________________________
    ** The Third rule of perl club is a statement of fact: pod is sexy.

Re: random elements from array - new twist
by Anonymous Monk on Nov 05, 2002 at 11:43 UTC
    If you are truely asking for a random element, then you can not add restictions. Otherwise it is not random!
      If you are truely asking for a random element, then you can not add restictions. Otherwise it is not random!

      Actually, what's being asked for here is called "random sampling without replacement" as opposed to "random sampling with replacement." That doesn't mean it's not random. Any good introductory statistics book will provide the details.

      laughingboy

Re: random elements from array - new twist
by Theseus (Pilgrim) on Nov 05, 2002 at 14:21 UTC
    Call me crazy, but wouldn't this be a much more simple solution? This results in your 5 random elements from the array being the 5 elements in %hash;
    my %hash; my @array = ('one','two','three','four','five','six','seven','eight',' +nine','ten','eleven','twelve','thirteen','fourteen','fifteen'); while ((scalar keys %hash) < 5) { my $r = $array[rand(@array)]; $hash{$r} = $r; }
      ++ for that! The only problem I can see is if the original array has duplicate information, then that won't work. If however the original array is made up of unique items, then you'd be fine I would think.

      There is no emoticon for what I'm feeling now.

        The original poster said he wanted to make sure there were no duplicates returned out of the array, that's what makes the hash perfect for this particular problem. IMHO, anyway...
Re: random elements from array - new twist
by traveler (Parson) on Nov 05, 2002 at 14:52 UTC
    You could also use Algorithm::Numerical::Shuffle which, despite the name, shuffles lists of strings, too. I have been using it reliably in a production system for a similar task for over a year with no problems.

    HTH, --traveler

Re: random elements from array - new twist
by Gilimanjaro (Hermit) on Nov 06, 2002 at 10:09 UTC
    Here's my solution... Might not be the cleanest (passing the whole array) but it sure looks pretty...
    ($\,$,)=("\n","\t"); @a=(1..15); print "Before:",@a; print "Selection:",picksome(5,@a); print "After:",@a; sub picksome { map { splice @_, rand @_, 1 } (1..shift) }
Re: random elements from array - new twist
by pizza_milkshake (Monk) on Nov 07, 2002 at 03:06 UTC
    #!/usr/bin/perl -l use strict; my @a = 1..15; my ($tmp, %h); while (keys %h < 5){ $tmp = $a[rand @a]; $h{$tmp}++ unless $h{$tmp}; } print for keys %h;
    perl -e'$_=q#: 13_2: 12/"{>: 8_4) (_4: 6/2"-2; 3;-2"\2: 5/7\_/\7: 12m m::#;s#:#\n#g;s#(\D)(\d+)#$1x$2#ge;print'

Log In?
Username:
Password:

What's my password?
Create A New User
Node Status?
node history
Node Type: perlquestion [id://210389]
Approved by Zaxo
Front-paged by wil
help
Chatterbox?
and the web crawler heard nothing...

How do I use this? | Other CB clients
Other Users?
Others wandering the Monastery: (14)
As of 2014-07-30 08:47 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    My favorite superfluous repetitious redundant duplicative phrase is:









    Results (229 votes), past polls