PerlMonks  

Re: "Out of memory" problem

by mbethke (Hermit)
on Nov 30, 2012 at 21:49 UTC (#1006523)


in reply to "Out of memory" problem

I would try the system's sort(1) first, as it's already optimized for this kind of job. 500M 32-bit integers (about 2 GB of raw data) would barely fit in your memory even if you put them in a straight C-style array with none of Perl's overhead, so it's probably a good idea to do a disk-based mergesort: split the file into half a dozen or so parts, run sort on each part individually, and then run sort once more with -m to merge the results.
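Roughly, the split/sort/merge steps could look like the sketch below. It assumes the file holds one integer per line (hence `sort -n`) and that `mktemp -d` is available; the function name `external_sort` and the default chunk size are made up for illustration, so tune the chunk size to whatever gives you half a dozen or so pieces.

```shell
#!/bin/sh
# Sketch of a disk-based merge sort built from split(1) and sort(1).
# Usage: external_sort INPUT OUTPUT [LINES_PER_CHUNK]
# Assumes one integer per line; external_sort is a hypothetical helper name.
external_sort() {
    input=$1
    output=$2
    chunk_lines=${3:-100000000}   # illustrative default, not a recommendation

    workdir=$(mktemp -d) || return 1

    # 1. Split the input into pieces of at most chunk_lines lines each.
    split -l "$chunk_lines" "$input" "$workdir/chunk." || return 1

    # 2. Sort each piece numerically, in place.
    for chunk in "$workdir"/chunk.*; do
        sort -n -o "$chunk" "$chunk" || return 1
    done

    # 3. Merge the already-sorted pieces; -m merges without re-sorting.
    sort -n -m -o "$output" "$workdir"/chunk.* || return 1

    rm -rf "$workdir"
}
```

Something like `external_sort numbers.txt numbers.sorted.txt` would then do the whole job (both file names are placeholders). Each individual sort only has to cope with one chunk at a time, and the final merge just streams through the sorted pieces.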

If you really need to do it in Perl, you could have a look at File::Sort, though I have no idea whether it works well.
