Re^10: Random sampling a variable length file.

by BrowserUk (Pope)
on Dec 27, 2009 at 15:53 UTC (#814526)

in reply to Re^9: Random sampling a variable length file.
in thread Random sampling a variable record-length file.

Taken together, these make even the extreme case just as amenable to this method as any other. If you remember which records you've hit and do not re-sample them, you're simply omitting a segment of the number line from a uniform distribution. The distributions on either side are still uniform, i.e., random.

Thank you again! That makes a great deal of sense.

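My first reaction was that remembering whether I had already picked a record was an awkward prospect, given that I only have the byte position and no knowledge of how long the record is. Then it dawned on me that querying the file offset once I've read the partial record makes for a perfect signature: it is the start position of the next whole record, so it is the same no matter where within the preceding record the random seek landed.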
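For anyone following along, here is a rough sketch of that idea, written in Python purely for illustration. The function name sample_records, the seen set, the retry cap, and the wrap-around to the first record when a seek lands in the last one are all my own assumptions, not anything settled in this thread. It also assumes newline-terminated records, and it makes no attempt to correct for the fact that a record following a long record is more likely to be hit.

```python
import random

def sample_records(path, count, seed=None):
    """Pick up to `count` distinct records by seeking to random byte offsets."""
    rng = random.Random(seed)
    seen = set()   # start offsets of records already picked (the "signatures")
    picked = []
    with open(path, "rb") as fh:
        size = fh.seek(0, 2)          # seek to end; returns the file size in bytes
        if size == 0:
            return picked
        attempts = 0
        # The retry cap is arbitrary; it only stops the loop from spinning
        # forever when `count` exceeds the number of records in the file.
        while len(picked) < count and attempts < 100 * count:
            attempts += 1
            fh.seek(rng.randrange(size))
            fh.readline()             # discard the partial record we landed in
            if fh.tell() >= size:     # landed in the last record: wrap to the first
                fh.seek(0)
            start = fh.tell()         # offset of the next whole record
            if start in seen:         # already sampled this record, try again
                continue
            seen.add(start)
            picked.append(fh.readline().rstrip(b"\r\n"))
    return picked
```

Calling sample_records("data.txt", 1000) (file name hypothetical) would return up to 1000 distinct records as byte strings. Only the offsets are remembered, so the memory cost is one integer per sampled record regardless of record length.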
