laziness, impatience, and hubris PerlMonks

### Is this a fair shuffle?

by saintmike (Vicar) on May 02, 2004 at 02:03 UTC
saintmike has asked for the wisdom of the Perl Monks concerning the following question:

Hey monks,

is this a fair way to shuffle an array:

```
my @a = (1..10);
my @b;
push @b, splice @a, rand @a, 1 while @a;
```
Algorithm::Numerical::Shuffle's and List::Util's shuffle() are OK, but if it can be done in a one-liner shorter than the one shown in "How do I shuffle an array" or in the FAQ, that'd be preferable.

And do any of you perlgolfer-monks have an idea how to transform the snippet above into one that works in place?

Replies are listed 'Best First'.
Re: Is this a fair shuffle?
by BrowserUk (Pope) on May 02, 2004 at 05:29 UTC

It is fair*, and pretty slick for a pure Perl implementation too, but the XS version of List::Util's shuffle is 6x faster.

Even if you do an in-place version,

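```
sub sm {
    my $n = @{ $_[0] };    # number of elements not yet moved
    # Splice a random element out of the first $n and push it onto the end.
    push @{ $_[0] }, splice @{ $_[0] }, rand $n--, 1 while $n;
}
```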

you gain very little.

* That is to say, it is a correct implementation of the Fisher-Yates shuffle, and is therefore fair.

Examine what is said, not who speaks.
"Efficiency is intelligent laziness." -David Dunham
"Think for yourself!" - Abigail
The algorithm is fair, but it's not an implementation of the Fisher-Yates shuffle. In fact, the solution has a pretty poor asymptotic running time: Ω(n²). This is due to the splicing of the array. Splicing out a single element of an array takes, on average, time linear in the length of the array. The algorithm presented by the OP splices out elements of successively smaller arrays, but it still adds up to quadratic time.

I've been pushing the Fisher-Yates shuffle instead of the splicing shuffle since 1995 or so. Since then, it has made its way into the FAQ, and we have List::Util::shuffle, but despite the FAQ spelling out what's wrong with the splicing algorithm, that one just doesn't want to die.

Abigail
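
For comparison with the splicing approach, here is a minimal sketch of an in-place Fisher-Yates shuffle in pure Perl. It is not the code from the FAQ or from List::Util, and the sub name `fisher_yates_shuffle` is made up for this illustration. Each step swaps two elements instead of splicing, so the work is one swap per element rather than a splice whose cost grows with the array length (roughly 1 + 2 + ... + n units of work over the whole run, up to a constant factor).

```perl
use strict;
use warnings;

# In-place Fisher-Yates shuffle (illustrative sketch; the sub name is hypothetical).
# Walks from the last index down to 1, swapping each element with a
# uniformly chosen element at or below its own position.
sub fisher_yates_shuffle {
    my ($array) = @_;                          # expects an array reference
    for ( my $i = $#$array ; $i > 0 ; $i-- ) {
        my $j = int rand( $i + 1 );            # uniform integer in 0 .. $i
        @$array[ $i, $j ] = @$array[ $j, $i ]; # swap via array slices
    }
    return;
}

my @deck = ( 1 .. 10 );
fisher_yates_shuffle( \@deck );
print "@deck\n";
```

The loop bound `$i > 0` also makes the sub safe to call on an empty or single-element array, where there is nothing to swap.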

Okay Abigail, I agree with you on the O(n²) thing with regard to the performance of the implementation of the splice versions, though it didn't seem "slow" in my original tests.

I ran tests with 10, 100, & 1000 elements, and as well as beating the pure Perl implementations of F_Y comfortably, nothing in the numbers actually screamed "quadratic" at me.

```
P:\test>200083

Shuffling 10 elements
          Rate PPcpy PPipl SPcpy SPipl XS_FY
PPcpy  38428/s    --  -22%  -45%  -56%  -78%
PPipl  49550/s   29%    --  -28%  -43%  -72%
SPcpy  69263/s   80%   40%    --  -20%  -60%
SPipl  86713/s  126%   75%   25%    --  -50%
XS_FY 174192/s  353%  252%  151%  101%    --

Shuffling 100 elements
         Rate PPcpy PPipl SPcpy SPipl XS_FY
PPcpy  4346/s    --  -30%  -42%  -53%  -78%
PPipl  6179/s   42%    --  -17%  -34%  -68%
SPcpy  7465/s   72%   21%    --  -20%  -61%
SPipl  9343/s  115%   51%   25%    --  -52%
XS_FY 19323/s  345%  213%  159%  107%    --

Shuffling 1000 elements
        Rate PPcpy SPcpy PPipl SPipl XS_FY
PPcpy  442/s    --  -29%  -30%  -39%  -77%
SPcpy  625/s   41%    --   -1%  -14%  -68%
PPipl  632/s   43%    1%    --  -13%  -68%
SPipl  726/s   64%   16%   15%    --  -63%
XS_FY 1957/s  342%  213%  210%  169%    --
```

However, since reading your post, I did runs of 10_000 and 100_000 and only now the difference begins to show up.

```
Shuffling 10000 elements
        Rate SPipl SPcpy PPcpy PPipl XS_FY
SPipl 21.2/s    --  -26%  -51%  -66%  -89%
SPcpy 28.8/s   36%    --  -33%  -55%  -85%
PPcpy 43.0/s  103%   49%    --  -32%  -77%
PPipl 63.3/s  198%  120%   47%    --  -66%
XS_FY  187/s  782%  550%  335%  196%    --

Shuffling 100000 elements
(warning: too few iterations for a reliable count)
(warning: too few iterations for a reliable count)
         Rate SPipl SPcpy PPcpy PPipl XS_FY
SPipl 0.262/s    --  -43%  -93%  -95%  -98%
SPcpy 0.464/s   77%    --  -88%  -92%  -97%
PPcpy  3.90/s 1386%  741%    --  -29%  -74%
PPipl  5.52/s 2005% 1090%   42%    --  -63%
XS_FY  14.8/s 5530% 3084%  279%  167%    --
```

And that shows the transition quite dramatically. In my defense, I was only really checking for its fairness, which I did using the code you'll recognise from an old post of yours.

```
permutation |   XS_FY  |   PPcpy  |  PPipl   |  SPcpy   |   SPipl
------------------------------------------------------------------
A B C D:   |   4104   |   4195   |   4098   |   4218   |    4229
A B D C:   |   4148   |   4212   |   4198   |   4170   |    4052
A C B D:   |   4116   |   4112   |   4195   |   4164   |    4240
A C D B:   |   4194   |   4151   |   4212   |   4052   |    4219
A D B C:   |   4181   |   4221   |   4223   |   4227   |    4295
A D C B:   |   4238   |   4140   |   4053   |   4195   |    4202
B A C D:   |   4224   |   4195   |   4182   |   4176   |    4319
B A D C:   |   4172   |   4073   |   4075   |   4196   |    4128
B C A D:   |   4233   |   4169   |   4201   |   4148   |    4220
B C D A:   |   4173   |   4220   |   4174   |   4204   |    4123
B D A C:   |   4112   |   4127   |   4109   |   4197   |    4167
B D C A:   |   4080   |   4189   |   4139   |   4148   |    4116
C A B D:   |   4220   |   4159   |   4167   |   4222   |    4107
C A D B:   |   4091   |   4212   |   4264   |   4126   |    4128
C B A D:   |   4191   |   4169   |   4039   |   4144   |    4150
C B D A:   |   4178   |   4173   |   4236   |   4181   |    4118
C D A B:   |   4171   |   4247   |   4134   |   4231   |    4212
C D B A:   |   4229   |   4142   |   4283   |   4205   |    4251
D A B C:   |   4092   |   4165   |   4107   |   4157   |    4120
D A C B:   |   4089   |   4083   |   4278   |   4117   |    4026
D B A C:   |   4049   |   4127   |   4142   |   4061   |    4204
D B C A:   |   4155   |   4144   |   4152   |   4085   |    4160
D C A B:   |   4272   |   4217   |   4149   |   4195   |    4040
D C B A:   |   4288   |   4158   |   4190   |   4181   |    4174
------------------------------------------------------------------
Std. Dev.   |  64.518  |  44.318  |  66.406  |  49.629  |  75.544
```

The only performance issue I considered was relative to the List::Util XS implementation. It was, as expected, considerably slower and that was the main point.
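
A tally like the permutation table above could be produced along the following lines. This is only a sketch, not the code that was actually used for that table: the sub names `splice_shuffle` and `tally_permutations` and the trial count are assumptions made for illustration. It shuffles a four-element list many times and counts how often each of the 24 possible orders comes up; for a fair shuffle the counts should all be close to trials/24.

```perl
use strict;
use warnings;

# The splice-based shuffle from the original question, wrapped in a sub
# (the name splice_shuffle is hypothetical).
sub splice_shuffle {
    my @in = @_;
    my @out;
    push @out, splice @in, rand @in, 1 while @in;
    return @out;
}

# Run $shuffler on @items $trials times and count each resulting order.
# Returns a hash reference mapping "A B C D"-style keys to counts.
sub tally_permutations {
    my ( $shuffler, $trials, @items ) = @_;
    my %count;
    $count{ join ' ', $shuffler->(@items) }++ for 1 .. $trials;
    return \%count;
}

my $count = tally_permutations( \&splice_shuffle, 100_000, qw(A B C D) );
printf "%s: %d\n", $_, $count->{$_} for sort keys %$count;
```

Any other shuffle that takes a list and returns a list can be passed as the first argument, which makes it easy to compare several implementations side by side.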

Now the bit where I got confused. I thought about the difference between, say, the pure-perl/copying and the splicing/copying versions, and the main difference is that the former swaps contents of elements whereas the latter swaps linked elements. I concluded that the difference between the two was an "implementation detail", in the same way as the difference between the pure-perl/copying and the XS version is an implementation detail, and therefore didn't change the nature of the basic algorithm being used, hence the addendum of "it's a Fisher-Yates". I was wrong!

I'm still a little bemused by why swapping pointers on the linked list, rather than swapping the contents of the elements the linked list points at, becomes quadratic, but the (newer) numbers demonstrate your point. I will have to sit down with a pen and paper and the source code of splice to understand why the costs grow that way.

So, thanks for setting me straight.

Re: Is this a fair shuffle?
by sgifford (Prior) on May 02, 2004 at 02:52 UTC

It looks basically OK for most purposes (although if I were using it for security purposes I'd use one of the published and widely recognized shuffling algorithms).

Here's a one-liner that does the equivalent thing in place. It walks through the array, and for the current element randomly picks one of the remaining elements (the current one included) and swaps it with the current element. It's not actually shorter than yours, so I'm not sure there's much use for it.

```
# $r is chosen uniformly from the positions not yet fixed: $_ .. $#a
for (0..$#a) { my $r = $_ + int rand(@a - $_); @a[$_,$r] = @a[$r,$_]; }
```
Re: Is this a fair shuffle?
by Gunth (Scribe) on May 02, 2004 at 02:36 UTC
It's okay. I suggest you use List::Util though. Here is the code in List::Util:
```
sub shuffle (@) {
  my @a=\(@_);
  my $n;
  my $i=@_;
  map {
    $n = rand($i--);
    (${$a[$n]}, $a[$n] = $a[$i])[0];
  } @_;
}
```
-Will
Here is the code in List::Util

Just as a point of information, List::Util will use an XS version if possible, which is considerably faster than the Perl one listed above.

Really, there isn't any reason not to use List::Util if it's available, and it's been core since 5.007003.
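
For completeness, a short usage sketch of `List::Util::shuffle`. The variable names are arbitrary. Note that `shuffle` returns a new list rather than modifying its argument, so an "in-place" shuffle is simply an assignment back to the same array.

```perl
use strict;
use warnings;
use List::Util qw(shuffle);

my @nums = ( 1 .. 10 );

my @shuffled = shuffle(@nums);    # returns a shuffled copy; @nums is unchanged
@nums = shuffle(@nums);           # assign back to replace the contents of @nums

print "@shuffled\n";
print "@nums\n";
```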
