PerlMonks
Re: sorting very large text files
by gam3 (Curate)
on Jan 06, 2010 at 03:40 UTC (#815874=note)
If the file is small enough that its keys fit into memory, you can sort just the keys in memory and then seek back into the file to write the lines out in order.

On the cooked data I tested, I got the following timings:

GNU sort:

    # time sort --temporary-directory=/opt data > sort1
    real 0m24.698s
    user 0m22.539s
    sys  0m1.950s

Perl sort:

    # time perl sort.pl
    real 0m55.900s
    user 0m39.897s
    sys  0m6.430s

The data file I used had a wc of:

    # wc data
    4915200 34406400 383385600 data

I am surprised that this Perl script is only about half the speed of GNU sort on this data. I think that on a bigger data set, with long lines, it might even be able to sort faster than GNU sort.

UPDATE: Most of the time seems to be spent in the output loop. All of the seeks seem to really slow things down.
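The sort.pl script itself is not reproduced here, so the following is only a minimal sketch of the kind of approach the post describes: index each line's key and byte offset, sort the index in memory, then seek to each line to write the output. The name `sort_by_offset`, the optional key-extractor argument, and the default of using the whole line as the key are assumptions for illustration, not the original code.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not the original sort.pl): sort the lines of
# $in_path into $out_path, holding only keys and byte offsets in memory.
sub sort_by_offset {
    my ($in_path, $out_path, $key_of) = @_;
    $key_of ||= sub { $_[0] };    # assumption: the whole line is the key

    open my $in, '<', $in_path or die "Cannot read $in_path: $!";
    binmode $in;

    # Pass 1: record [key, offset] for every line.
    my @index;
    my $offset = tell $in;
    while (defined(my $line = <$in>)) {
        chomp(my $text = $line);
        push @index, [ $key_of->($text), $offset ];
        $offset = tell $in;
    }

    # Sort the index by key; equal keys keep file order via the offset.
    my @sorted = sort { $a->[0] cmp $b->[0] or $a->[1] <=> $b->[1] } @index;

    # Pass 2: seek to each line in sorted order and copy it out.
    open my $out, '>', $out_path or die "Cannot write $out_path: $!";
    binmode $out;
    for my $entry (@sorted) {
        seek $in, $entry->[1], 0 or die "seek failed: $!";
        my $line = <$in>;
        $line .= "\n" unless $line =~ /\n\z/;    # last line may lack a newline
        print {$out} $line;
    }
    close $out or die "Cannot close $out_path: $!";
    close $in;
    return scalar @sorted;
}

# Usage: perl sort.pl data sort2
sort_by_offset(@ARGV[0, 1]) if @ARGV == 2;
```

With the default key the index is as large as the file, so the memory saving only appears when a shorter key is passed in, for example `sub { (split ' ', $_[0])[0] }` for the first field. Pass 2 issues one seek per line, which is the kind of output loop the UPDATE points to as the slow part. Note also that `cmp` compares bytewise, so the order can differ from GNU sort under a non-C locale.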
-- gam3
A picture is worth a thousand words, but takes 200K.