PerlMonks
Re: Split a file based on column
by space_monk (Hermit) on Jan 17, 2013 at 10:53 UTC (#1013753)
All of the above answers seem to have problems with possible filehandle limits; personally I would read the entire file and convert it to a hash of arrays, and then write each array out to a file indicated by the array key. This has the advantage that only one file is open at any time. I will stick my neck out and say it will also be faster due to less file I/O.

As a second comment, you should use something like Text::CSV to get the data, but if you want it quick and dirty there's a good argument for using split instead of a regex here.

Amount of data: 300k rows × 64k per row ≈ 19.6GB of data may cause problems, so maybe a compromise is to write the data out when an array gets to a certain size.

The following (untested/debugged) shows the idea... it assumes you specify the file(s) you want to read from on the command line.

Update: Changed when it writes to file as a result of a comment from davido.
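
A minimal sketch of the approach described above, not the code originally posted. It buffers rows in a hash of arrays and appends a bucket to its file once it reaches a size limit, so only one output file is open at a time. Several details are assumptions made for illustration: the input is comma-separated, the key is the first column, output files are named "<key>.txt", and the names $MAX_ROWS, flush_key and split_files are hypothetical.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Assumptions (not from the original post): comma-separated input,
# key in the first column, one output file per key named "<key>.txt".
my $MAX_ROWS = 1000;    # flush a bucket once it holds this many rows
my %bucket;             # key => array ref of buffered rows

# Append the buffered rows for one key to its file, then empty the buffer.
# The file is opened and closed here, so only one output handle exists at a time.
sub flush_key {
    my ($key) = @_;
    my $rows = $bucket{$key};
    return unless $rows && @$rows;
    open my $out, '>>', "$key.txt" or die "Cannot open $key.txt: $!";
    print {$out} map { "$_\n" } @$rows;
    close $out or die "Cannot close $key.txt: $!";
    @$rows = ();
}

# Read each input file line by line and bucket the rows by first column.
sub split_files {
    my (@files) = @_;
    for my $file (@files) {
        open my $in, '<', $file or die "Cannot open $file: $!";
        while (my $line = <$in>) {
            chomp $line;
            my ($key) = split /,/, $line, 2;
            # Skip blank lines and rows with an empty key.
            next unless defined $key && length $key;
            push @{ $bucket{$key} }, $line;
            flush_key($key) if @{ $bucket{$key} } >= $MAX_ROWS;
        }
        close $in;
    }
    # Write out whatever is still buffered.
    flush_key($_) for keys %bucket;
}

# Input files come from the command line, as in the original description.
split_files(@ARGV) unless caller;
```

Because the files are opened in append mode, running the script twice over the same input adds the rows twice; delete old output first. The key is used directly as a file name, so this sketch is only safe when keys contain no path separators. For real CSV data with quoted fields, Text::CSV is the better choice than split.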
A Monk aims to give answers to those who have none, and to learn from those who know more.