Beefy Boxes and Bandwidth Generously Provided by pair Networks
good chemistry is complicated,
and a little bit messy -LW

Re: Randomize an array

by BooK (Curate)
on Sep 08, 2000 at 00:16 UTC ( #31477=note: print w/replies, xml ) Need Help??

in reply to Randomize an array

Well, if you don't want to install yet another module, and the randomness you seek is no better than the one provided by rand, you can try the following:
sort { (-1,0,1)[rand 3] } @array

Replies are listed 'Best First'.
RE (tilly) 2: Randomize an array
by tilly (Archbishop) on Sep 08, 2000 at 00:26 UTC
    Having 0 in that list is a mistake since the qsort algorithm takes that to mean that they are equal and hence it only needs to use one of those to compare with others.

    A test that is sensitive to this is looking for "rising sequences". Scramble a sorted list. Make successive passes through the array, looking for the first, then second, then third, then fourth, etc elements. How many passes did it take? With your algorithm it will take fewer passes than with a well-shuffled array.

    ObTrivia: Take a fresh deck of cards. Shuffle a few times. Do the above test by suit. 2 suits start off sorted front to back, 2 back to front. This test is able to detect that for a long time. (Certainly well after the "7 shuffles leaves a well-shuffled deck" factoid a lot of people hear. By some measures it is well shuffled. By this one it still isn't. :-)

      I completely agree with you. 0 should not be here. It's just that we talked about

      sort { rand(3) - 1 } @array
      at the september 1999 meeting.

      I just remembered the sort rand trick and shared it. So here comes an updated version:

      sort { (-1,1)[rand 2] } @array
        I was trying to figure out how to explain why this was going to give a perfect sort, and I realized that in fact this trick simply cannot give a perfect sort for an array of size larger than 2 when fed an algorithm with bounded performance!

        This is cute.

        As the above makes clear, you are basically making random coin flips until it spits out an answer. If the array has n entries, there is some maximum number of flips, say k, that you could be asked for.

        Turning it around on its head I could just ask you up front to make k coin flips, then feed that into the algorithm, and get an answer. Now there are 2^k possible sequences that you could give me, and they are all equally likely. So if you write in base 2, the probability of any particular permutation is 2^(-k) times the number of sequences that gave that permutation.

        What that is I don't care. What I do care about is that when written out in base 2 that probability is a decimal that terminates after at most k places. But for each permutation the probability we really want is 1/n!, and if n is bigger than 2 then n! is divisible by 3 which is *not* a power of two so 1/n! has to be (base 2) an infinite repeating decimal!

        In other words I may not, without calculating it, have any idea how your anwer will be biased, but in fact it is!

        ObTrivia: A somewhat similar counting argument manages to show that the 400 year Gregorian calendar repeats days of the week, but cannot divide a particular day of the month between them. However it takes explicit calculation to show that the 13'th falls on Friday more than 1 day in 7.

      Having 0 in that list is a mistake since the qsort algorithm takes that to mean that they are equal and hence it only needs to use one of those to compare with others.

      I personally believe that before someone else points out that the node I'm replying to is some eight years old, I'd better do so myself: the reason why I'm doing it that I've seen this very thread referenced quite recently; which means that some people may still be looking here for info they will try to actually use. And here a terribly wrong technique has been advertised, so I'm writing this post for the benefit of potential readers to the effect of warning them not to use it! Namely, even if amended of the "0 in that list" problem, the general idea of shuffling a list by sorting it with a routine returning random values e.g.:

      my @shuffled = sort {.5<rand} @input;

      is flawed! Several reasons why it is are explained in a thread which came two years later. To put it briefly, if you really want to use sort for this, at the very least you must generate the list of random values in advance, whatever actual technique you actually choose in the end. E.g.:

      my %r; my @shuffled = sort { ($r{$a}||=rand) <=> ($r{$b}||=rand) } @input;


      my @r = map rand, @input; my @shuffled = @input[sort {$r[$a] <=> $r[$b]} 0..$#input];

      Update: as a novice, back in my old days at the Monastery whenever I happened to see a post of mine which -exactly like this one and its companion- turned out to have a negative reputation, I would edit it to the effect of inserting an update and whine about the downvotes or more precisely asking why it earned them. Of course, I've learnt how to do better in the meantime, and generally refrain to. Except this time: since I would like to "thank" all those geniuses who downvoted this node on part of the poor newbie who will stumble upon it one day, and judging from the reputation will think that my complaints are moot, thus possibly following the advice of the broken shuffling technique... only to be bitten in the neck, but hopefully not later than having spread the word, because "it is so cool..."

      If you can't understand the incipit, then please check the IPB Campaign.
        First of all, I did not recommend that sorting method.

        Secondly in this very thread I already explained why this randomization technique could not possibly provide perfectly random answers.

        So while I commend your desire to add useful information to old threads, I don't think you picked the right target in this case.

Re^2: Randomize an array
by Anonymous Monk on Jul 01, 2008 at 20:57 UTC
    That code does not work.

      Are you sure?

      #!/usr/bin/perl # use strict; use warnings; my @array=0..100; @array=sort { (-1,0,1)[rand 3] } @array; print join(',',@array),"\n";
      sini@ordinalfabetix:~$ ./ 70,61,28,24,5,52,41,57,21,53,82,10,34,86,29,12,46,2,0,1,56,14,47,4,3,6 +,58,8,7,20,9,72,17,11,13,18,68,25,55,22,67,54,23,15,90,16,19,40,42,43 +,59,30,65,36,60,27,95,76,62,88,31,66,26,33,32,38,37,35,64,50,39,44,69 +,45,81,51,63,74,49,48,77,84,87,94,71,80,79,73,83,75,85,78,89,92,91,93 +,98,99,96,100,97 sini@ordinalfabetix:~$

      Rule One: "Do not act incautiously when confronting a little bald wrinkly smiling man."

Log In?

What's my password?
Create A New User
Node Status?
node history
Node Type: note [id://31477]
and the web crawler heard nothing...

How do I use this? | Other CB clients
Other Users?
Others imbibing at the Monastery: (6)
As of 2020-02-29 13:13 GMT
Find Nodes?
    Voting Booth?
    What numbers are you going to focus on primarily in 2020?

    Results (128 votes). Check out past polls.