Beefy Boxes and Bandwidth Generously Provided by pair Networks
Come for the quick hacks, stay for the epiphanies.
 
PerlMonks  

comment on

( [id://3333]=superdoc: print w/replies, xml ) Need Help??

The argument in the cited article apply to any shufling algorithm that uses anything other than random binary 2k-ary choices (in a program running on a binary-representation computer). This is true even if the numbers are generated by an ideal uniform random process (as opposed to a PRNG), and then stored in computer memory (necessarily truncated to a finite precision). In other words, the problem is deeper than the one caused by the finite periods of PRNGs. Consider applying Fisher-Yates to sorting an array of 3 elements. The first step requires selecting one of three options at random, and is done by testing whether a "random" number (or rather, a rational number obtained from truncating the precision of a randomly generated real number) stored in some computer register is within certain bounds or not. The cardinality of the set of such numbers is 2w, where w is the number of bits in the register, and therefore it is impossible that each of the three elements in the list will be chosen with probability of exactly 1/3. With the exception of the very last one of those in which the element to swap is chosen from among 2k alternatives, for some integer 0 < k < w, all the swaps in Fisher-Yates (under the conditions stated) result from a non-uniform sampling.

The question remains of whether the magnitude of this deviation from perfect uniformity is one that can significantly impair the intended use of the algorithm, and the answer of course depends on the application. In the case of the example above, the magnitude of the relative error grows as N/2w, so I imagine that a simulation that relies on a uniform sampling of the space of all rearrangements of a large number of elements may have to start worrying about this effect.

I reiterate that there is a fundamental difference between the flawed algorithms mentioned in When the Best Solution Isn't, and those that are based on sorting a set of pseudo-random number tags (like the one I posted). With the former, the deviation from the correct distribution can be substantial, and would occur even if one could use infinite precision, perfectly uniformly sampled random numbers, whereas with the latter this deviation is no bigger than it is for any shuffle algorithm that uses anything other than random binary 2k-ary choices (and of course, would disappear if the random numbers were of infinite precision). Therefore, the flaws in the former algorithms are logical flaws independent of numerical errors.

Update: There's a small inaccuracy in the argument I presented above, but correcting it does not affect its gist. Above I say that uniform sampling can occur only when the number of choices is 2; this is incorrect; it can occur only when the number of choices is 2k, for some integer 0 < k < w, where w is as defined above.

the lowliest monk


In reply to Re^5: Functional shuffle by tlm
in thread Functional shuffle by Roy Johnson

Title:
Use:  <p> text here (a paragraph) </p>
and:  <code> code here </code>
to format your post; it's "PerlMonks-approved HTML":



  • Are you posting in the right place? Check out Where do I post X? to know for sure.
  • Posts may use any of the Perl Monks Approved HTML tags. Currently these include the following:
    <code> <a> <b> <big> <blockquote> <br /> <dd> <dl> <dt> <em> <font> <h1> <h2> <h3> <h4> <h5> <h6> <hr /> <i> <li> <nbsp> <ol> <p> <small> <strike> <strong> <sub> <sup> <table> <td> <th> <tr> <tt> <u> <ul>
  • Snippets of code should be wrapped in <code> tags not <pre> tags. In fact, <pre> tags should generally be avoided. If they must be used, extreme care should be taken to ensure that their contents do not have long lines (<70 chars), in order to prevent horizontal scrolling (and possible janitor intervention).
  • Want more info? How to link or How to display code and escape characters are good places to start.
Log In?
Username:
Password:

What's my password?
Create A New User
Domain Nodelet?
Chatterbox?
and the web crawler heard nothing...

How do I use this?Last hourOther CB clients
Other Users?
Others studying the Monastery: (5)
As of 2024-03-19 02:38 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    No recent polls found