Process large text data in array
by hankcoder (Scribe) on Mar 10, 2015 at 14:31 UTC
hankcoder has asked for the wisdom of the Perl Monks concerning the following question:
Hi, I'm trying to find a better solution, or at least reduce the time it takes to process a large amount of text data that is read from a file into an array. I will try to keep everything as simple as possible and post the related code here.
The current test runs on a local Windows XP Pro machine with ActivePerl. The live system will be in a Unix/Linux environment.
The text file is line-oriented, and the lines are not fixed length. The current test file contains 300k lines, about 38MB. A sample of the line format:
I tested two ways of reading the file content; both work very fast, taking only about 4 sec.
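For illustration only, two common ways of reading a whole file into an array might look like the sketch below. The helper names read_lines_list and read_lines_slurp are hypothetical and are not the code from the original post; both assume a plain newline-terminated text file.

```perl
use strict;
use warnings;

# Hypothetical helper: read every line with readline in list context.
sub read_lines_list {
    my ($path) = @_;
    open my $fh, '<', $path or die "Cannot open $path: $!";
    my @lines = <$fh>;    # one array element per line
    close $fh;
    chomp @lines;         # strip the trailing newline from each element
    return \@lines;
}

# Hypothetical helper: slurp the whole file as one string, then split it.
sub read_lines_slurp {
    my ($path) = @_;
    open my $fh, '<', $path or die "Cannot open $path: $!";
    my $content = do { local $/; <$fh> };    # undef $/ reads to end of file
    close $fh;
    return [ split /\n/, $content ];
}
```

Both helpers return an array reference, so a 300k-line array is not copied again when it is passed on to later processing.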
However, after reading all the data into memory, I need to go through it once from beginning to end to pick out the wanted lines based on given criteria and to count the total matches. With this processing added, the total time is about 37 sec.
I'm not sure whether this speed is normal, but if it can be reduced, that would be really great.
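As a rough sketch of the single pass described above (not the code from the original post), the loop below walks an array once, keeps the lines that satisfy a criterion, and counts them. The name count_matches and the $wanted code reference are hypothetical; the real criteria depend on the line format.

```perl
use strict;
use warnings;

# Hypothetical helper: one pass over the lines, collecting and counting matches.
# $lines  is an array reference of lines.
# $wanted is a code reference that returns true for a wanted line (an assumption).
sub count_matches {
    my ($lines, $wanted) = @_;
    my $total = 0;
    my @hits;
    for my $line (@$lines) {
        next unless $wanted->($line);    # skip lines that fail the criterion
        $total++;
        push @hits, $line;
    }
    return ($total, \@hits);
}
```

A call might look like my ($total, $hits) = count_matches(\@lines, sub { index($_[0], 'foo') == 0 }); where the 'foo' prefix test is a placeholder criterion.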
The code used to process the array is here:
The subroutine code for converting a line to a record (line2rec):
If I comment out the call to &line2rec($line); the time drops to 10 sec. So I guess this sub can be improved further.
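One way to measure such a difference more precisely is to time a pass over the array with and without the per-line call. The sketch below uses the core Time::HiRes module; the helper name time_pass is hypothetical, and the code reference passed in stands for whatever per-line work is being measured (for example a call to line2rec).

```perl
use strict;
use warnings;
use Time::HiRes qw(time);    # high-resolution replacement for the built-in time

# Hypothetical helper: run $code on every line and return the elapsed seconds.
sub time_pass {
    my ($lines, $code) = @_;
    my $start = time();
    $code->($_) for @$lines;
    return time() - $start;
}
```

Comparing time_pass(\@lines, sub { }) against time_pass(\@lines, sub { line2rec($_[0]) }) would separate the cost of the loop itself from the cost of the conversion sub.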
Any suggestions are much appreciated. Thanks.