Ah, well, I know jack about FASTA files, so I didn't consider that. Of course, by changing the reader to accumulate records instead of lines, it could be adapted. Though since there are already a couple working examples from you and Marshall, and since mine has a bias in it, there's no real reason to do so.

I know that *you* know how to make the changes, but if someone tripping across this node in the future wants to, they could do something (untested!) like this:

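my @record;
while (1) {
    my $line = <$FH>;
    # A new record marker (or end of file) completes the record accumulated so far
    if (@record and (!defined $line or $line =~ /start of record marker/)) {
        ++$cnt_recs;
        if ($num/$cnt_recs > rand) {
            my $i = @samples;
            # Once the sample is full, overwrite a random slot instead of growing
            $i = rand @samples if $i >= $num;
            $samples[$i] = [$cnt_recs, [@record]];
        }
        @record = ();
    }
    last unless defined $line;
    # Accumulate record (the marker line is kept as the record's first line)
    push @record, $line;
}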
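For anyone who wants something they can paste and run, here is a self-contained sketch of the same idea. The sub name `sample_records`, its arguments, and the `^>` marker are my own assumptions for illustration (as I understand it, FASTA header lines start with `>`), and it carries over whatever bias my original sampling has, so treat it as a starting point rather than a finished tool.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): keep up to $num records chosen
# at random from $fh, where a line matching $marker starts a new record.
sub sample_records {
    my ($fh, $num, $marker) = @_;
    my (@samples, @record);
    my $cnt_recs = 0;

    # Offer one completed record to the sample, then start a fresh one.
    my $offer = sub {
        return unless @record;
        ++$cnt_recs;
        if ($num / $cnt_recs > rand) {
            # Fill empty slots first, then overwrite a random slot.
            my $i = @samples < $num ? @samples : int rand @samples;
            $samples[$i] = [$cnt_recs, [@record]];
        }
        @record = ();
    };

    while (my $line = <$fh>) {
        $offer->() if $line =~ $marker;   # a marker closes the previous record
        push @record, $line;
    }
    $offer->();                           # the last record has no trailing marker
    return @samples;
}

# Usage: pick 2 of 4 records from an in-memory file ('>' marker is an assumption).
my $data = ">a\nAAA\n>b\nCCC\nGGG\n>c\nTTT\n>d\nACGT\n";
open my $fh, '<', \$data or die "Cannot open in-memory file: $!";
my @picked = sample_records($fh, 2, qr/^>/);
print "record $_->[0]: ", join('', @{ $_->[1] }) for @picked;
```

Each returned element is `[record_number, [lines]]`, the same shape as `@samples` above, so the rest of the code doesn't need to change.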


