Beefy Boxes and Bandwidth Generously Provided by pair Networks
There's more than one way to do things
 
PerlMonks  

Re^6: Functional shuffle

by tall_man (Parson)
on Apr 03, 2005 at 04:09 UTC ( #444476=note: print w/ replies, xml ) Need Help??


in reply to Re^5: Functional shuffle
in thread Functional shuffle

Actually, selecting with an exact probability of 1/3 (or any other rational fraction less than one) is feasible with even a source of single random bits. I saw the technique described once in "Mathematical Games" in Scientific American once.

Represent 1/3 as a binary fraction: .01010101... and generate random numbers bit by bit, starting from the decimal point, calling this a new binary fraction. If the next bit we generate is 1 where there is a 0 in the target, then quit -- we are over 1/3. If the next bit is 0 where there is a 1 in the target, then quit -- we are under 1/3 and we can accept the case. If it's equal, keep generating bits. The probability we will have to continue adding bits halves with each bit.

This approach can make the Fisher-Yates shuffle arbitrarily accurate. It would be possible, but messy, to apply it to sort values that compared equal, too. With this enhancement both shuffles should be fair, but Fisher-Yates wins by being O(N) instead of O(NlogN).


Comment on Re^6: Functional shuffle
Re^7: Functional shuffle
by tlm (Prior) on Apr 03, 2005 at 04:59 UTC

    This approach can make the Fisher-Yates shuffle arbitrarily accurate. It would be possible, but messy, to apply it to sort values that compared equal, too.

    I fail to see the difference. Certainly one can make any numerically-limited algorithm "arbitrarily accurate" if one is willing to increase the number of bits used to represent the numbers in the calculation. The variation of Fisher-Yates that you propose would require a special check to handle the highly improbable case that the random number generator produced a number that was exactly equal to k/N, for any integer k in 0..N - 1, in order to generate more bits to break the tie (without this provision, the algorithm is identical to the standard Fisher-Yates as far as the uniformity of the sampling is concerned). Likewise, the tag-sorting shuffle algorithm I posted would need to be modified to handle the highly improbable case that two of the random tags happened to be identical (which would result in a stable-sort artifact), by generating more bits to break the tie.

    ...but Fisher-Yates wins by being O(N) instead of O(NlogN).

    Yes, of course, but the speed superiority of Fisher-Yates has never been in question. My position all along has been limited to defending the algorithm I posted against the claim that it was logically flawed in the same way as the sort-based algorithms discussed in When the Best Solution Isn't are. The problem with those algorithms would remain even if we had infinite-precision computers at our disposal; this is not the case for the sort-based algorithm I posted. Furthermore, in comparison to the errors incurred by those truly flawed algorithms, the errors incurred by numerically-limited algorithms like Fisher-Yates or the one I posted are entirely insignificant.

    Update: Fixed minor typo/error above: the range 0..N - 1 mentioned towards the middle of the first paragraph was erroneously given as 1..N - 1 in the original post. Also, immediately before in the same sentence, the reference to the variable k was missing.

    the lowliest monk

Log In?
Username:
Password:

What's my password?
Create A New User
Node Status?
node history
Node Type: note [id://444476]
help
Chatterbox?
and the web crawler heard nothing...

How do I use this? | Other CB clients
Other Users?
Others browsing the Monastery: (8)
As of 2014-10-21 02:57 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    For retirement, I am banking on:










    Results (95 votes), past polls