Re: Searching pattern in 400 files and getting count out of each file

by Athanasius (Prior)
on Nov 08, 2012 at 07:21 UTC (#1002828)


in reply to Searching pattern in 400 files and getting count out of each file

Hello Rita_G, and welcome to the Monastery!

Here is some good advice from the Camel Book (4th edition, p. 696):

Avoid unnecessary syscalls. ... Avoid unnecessary system calls [that is, calls to Perl's system function]. ... Worry about starting subprocesses, but only if they’re frequent.

The performance problems you are seeing almost certainly derive from the frequent use of backticks in your script. Each use starts a new subprocess (and often a shell to run it), and that cost adds up quickly when it is repeated many times.

The good news is that all the backtick operations in your script can be replaced with pure Perl. See grep, File::Basename, and the substitution operator s/// in Regexp Quote-Like Operators (perlop) and perlre.
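
As a rough illustration of that advice, here is a minimal sketch of what such replacements might look like. The original script is not shown in this reply, so the helper names (count_matching_lines, label_for) and the ".log" suffix are hypothetical and only stand in for whatever the backticked grep, basename and sed calls were doing.

```perl
use strict;
use warnings;
use File::Basename qw(basename);

# Hypothetical helper: count the lines of $path that match $regex,
# roughly what `grep -c` reports, without starting a subprocess.
sub count_matching_lines {
    my ($path, $regex) = @_;
    open my $fh, '<', $path or die "Cannot open '$path': $!";
    # Perl's built-in grep returns the number of matches in scalar context.
    my $count = grep { /$regex/ } <$fh>;
    close $fh;
    return $count;
}

# Hypothetical helper: derive a short label from a path, in place of
# a backticked `basename` piped through `sed`.
sub label_for {
    my ($path) = @_;
    my $name = basename($path);    # File::Basename instead of `basename ...`
    $name =~ s/\.log\z//;          # s/// instead of `... | sed ...`
    return $name;
}
```

Something like count_matching_lines('/some/dir/app.log', qr/ERROR/) would then do its work inside the running Perl process, with no fork per call.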

Don’t even think about multithreading until you’ve re-implemented your script in pure Perl and benchmarked the results!

Hope that helps,

Athanasius <°(((><contra mundum


Re^2: Searching pattern in 400 files and getting count out of each file
by space_monk (Chaplain) on Nov 08, 2012 at 10:31 UTC
    The original posting also (through the grep) reads each file once for every pattern; that is, it opens every one of the 400 files 8000 or so times. Look at strategies to read each file just once. Another reply in this thread seems to have done this implicitly, without explaining why, and it also misses the opportunity to tidy up the patterns outside the file-read loop.
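
    One possible shape for the "read each file once" strategy is sketched below. It assumes the patterns are plain strings to be matched literally (hence the \Q...\E) and that a per-line count is what is wanted; neither is confirmed by the original post, and the name count_patterns_in_files is hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: for every file, count how many lines match each
# pattern, while opening and reading each file exactly once.
sub count_patterns_in_files {
    my ($patterns, $files) = @_;

    # Compile the patterns once, outside the file loop.
    # Assumption: patterns are literal strings, so they are quoted with \Q.
    my %regex_for = map { $_ => qr/\Q$_\E/ } @$patterns;

    my %count;    # $count{$file}{$pattern} = number of matching lines
    for my $file (@$files) {
        open my $fh, '<', $file or die "Cannot open '$file': $!";
        $count{$file}{$_} = 0 for @$patterns;
        while (my $line = <$fh>) {
            for my $pattern (@$patterns) {
                $count{$file}{$pattern}++ if $line =~ $regex_for{$pattern};
            }
        }
        close $fh;
    }
    return \%count;
}
```

    With 400 files this performs 400 opens in total, however many patterns there are, instead of one open per file per pattern.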
    A Monk aims to give answers to those who have none, and to learn from those who know more.
