Pathologically Eclectic Rubbish Lister | |
PerlMonks |
comment on |
( [id://3333]=superdoc: print w/replies, xml ) | Need Help?? |
These two snippets implement Algortihm S(3.4.2) and Algortihm R(3.4.2) from Knuth's Art Of programming.
The first randomly selects N items from an array of elements, and returns a reference to an array containing the elements. Note that it will not necessarily consider all of the elements in the list. The second randomly selects N items from a file of indeterminate size and returns an array containing the selected elements. Records in the file are assumed to be per line, and the lines are chomped while reading. This requires only 1 pass through the list. A slight modification can be made to use the snippet in situations where N records would exceed memory limitations, however this requires slightly more than 1 pass (/msg if you need this explained) The usual caveats about randomness in computers in general and the randomness of perls rand() function apply to these routines. NOTE The routines are not as efficient as they could be. Instead of considering a record and determining if the record should be included in the result set, they could be modified to determine how many elements to skip before including a record. That modification is left to the reader. Any errors are almost certainly in my interpretation of Knuths algortihm, and not in the actual technique. <strict>UPDATE</strict>And im so glad I put in the above caveat. ;-) Anonymonk is right, I posted an incorrect version first time. Fixed.
In reply to Random Sampling by demerphq
|
|