PerlMonks  

Re: Randomize lines with limited memory

by jdporter (Canon)
on Nov 01, 2003 at 22:14 UTC (#303844)


in reply to Randomize lines with limited memory

Divide and conquer. Chop the big file into as many smaller files as it takes to make each one a manageable size, then randomize each of them normally (i.e. with a Fisher-Yates shuffle). The tricky part is the initial chopping up. You can't simply take the first 10k lines, then the second 10k lines, etc.: every line would stay inside its original 10k block, so that wouldn't give you adequate randomization. I think I would read each line, choose an output file at random, and append the line to it. The files won't come out exactly the same size (except perhaps rarely), but that doesn't matter. You could similarly randomize the final joining step as well, but I'm not sure that would actually buy you anything. You could try it.
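A rough sketch of the idea, written in Python only to keep it short (the same structure maps directly onto Perl). The function name `shuffle_lines`, the `num_buckets` parameter, and the bucket file names are made up for illustration. It assumes each bucket fits in memory on its own and that the OS lets you hold `num_buckets` files open at once.

```python
import os
import random
import tempfile


def shuffle_lines(src_path, dst_path, num_buckets=16, rng=None):
    """Shuffle the lines of src_path into dst_path via temporary bucket files.

    Only one bucket is held in memory at a time, so pick num_buckets large
    enough that (file size / num_buckets) fits comfortably in RAM.
    """
    rng = rng or random.Random()
    with tempfile.TemporaryDirectory() as tmpdir:
        paths = [os.path.join(tmpdir, f"bucket{i}.txt") for i in range(num_buckets)]
        buckets = [open(p, "w") for p in paths]
        try:
            # Pass 1: append each line to a bucket file chosen at random.
            with open(src_path) as src:
                for line in src:
                    if not line.endswith("\n"):
                        line += "\n"  # terminate a final line that lacks a newline
                    rng.choice(buckets).write(line)
        finally:
            for b in buckets:
                b.close()

        # Pass 2: shuffle each bucket in memory, then join them in order.
        with open(dst_path, "w") as dst:
            for p in paths:
                with open(p) as f:
                    lines = f.readlines()
                rng.shuffle(lines)  # in CPython this is a Fisher-Yates shuffle
                dst.writelines(lines)
```

Calling something like `shuffle_lines("big.txt", "shuffled.txt", num_buckets=64)` (file names hypothetical) would do the whole job in two passes over the data. The sketch joins the buckets in fixed order and does not randomize the joining step.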

jdporter
The 6th Rule of Perl Club is -- There is no Rule #6.
