PerlMonks
Re: to process big text file, use java or perl?
by sundialsvc4 (Abbot) on Jul 11, 2014 at 17:44 UTC ( [id://1093274]=note )
Well, first of all, “1 gigabyte,” by today’s standards, is not particularly “big.” My laptop can slurp 16 times that amount of data into its RAM. But, secondly, it really all depends on how you go about it. For instance, you probably don’t want to, and certainly don’t need to, suck up all of that data at once. You probably want to process it a line at a time. And even if the structure of a particular file is not such that you can break it sensibly into “lines,” it most certainly possesses some kind of internal structure that will enable you to read selected portions of it into memory. Therefore, files of arbitrary size can be processed, and it does not matter in the slightest which programming language you use.

And yet, these days, and with both languages, that’s somewhat beside the point. You want to find a way to do as little original work as possible ... by standing on the shoulders of giants. “Do not do a thing already done, whatever it is.” Both Java and Perl have a rich complement of third-party modules ... of “stuff that you didn’t(!) have to write and debug” ... to help you along with whatever it is that you happen to be doing. Therefore, this is where you should focus a lot of your attention. Perl’s module archive is called CPAN, and Java has several such repositories in common use. The size of the file really does not matter. Isn’t there an existing library that will help you with this? Might there be one which exactly matches the description of this (as it turns out, not so unique ...) task that you have been assigned? You need to determine this.

“So, which one?” Well, if you are already more familiar with one, you probably want to go with that one. And if not, look at the preceding paragraph. If you are not yet sure how to proceed ... step back and do a little research. Investigate how you might most efficiently get your job done given either of the two scenarios, without immediately committing to either one. Then, make the choice that is most appropriate for you.
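To make the “a line at a time” point concrete, here is a minimal sketch in Java; the same loop is just as short in Perl. The class name `LineScan`, the method `countMatchingLines`, and the idea of counting lines that contain some text are all made up for illustration, and the sketch assumes the file is UTF-8 text. The only thing it is meant to show is that memory use depends on the longest line, not on the size of the file.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical example class: the name and the counting task are illustrative only.
final class LineScan {

    // Counts the lines that contain `needle`, holding only one line in memory at a time.
    // Assumes the file is UTF-8 text; malformed input will raise an IOException.
    static long countMatchingLines(Path file, String needle) throws IOException {
        long count = 0;
        // try-with-resources closes the reader even if reading fails part-way through.
        try (BufferedReader reader = Files.newBufferedReader(file, StandardCharsets.UTF_8)) {
            String line;
            while ((line = reader.readLine()) != null) {
                if (line.contains(needle)) {
                    count++;
                }
            }
        }
        return count;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.err.println("usage: java LineScan <file> <text>");
            return;
        }
        System.out.println(countMatchingLines(Paths.get(args[0]), args[1]));
    }
}
```

Whatever your real per-line work is would replace the `contains` test. If the file has no sensible “lines,” the same shape applies with a fixed-size buffer or a record parser from an existing library in place of `readLine()`.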