http://www.perlmonks.org?node_id=1002828


in reply to Searching pattern in 400 files and getting count out of each file

Hello Rita_G, and welcome to the Monastery!

Here is some good advice from the Camel Book (4th edition, p. 696):

Avoid unnecessary syscalls. ... Avoid unnecessary system calls. ... Worry about starting subprocesses, but only if they’re frequent.

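The performance problems you are seeing almost certainly derive from the frequent use of backticks in your script. Each backtick expression starts a new subprocess (and often a shell as well) and waits for it to finish, so the overhead adds up quickly when it is repeated thousands of times.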

The good news is that all the backtick operations in your script can be replaced with pure Perl. See grep, File::Basename, and the substitution operator s/// (documented under Regexp Quote-Like Operators in perlop, and in perlre).
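
As a rough illustration of what those replacements might look like (your script isn't reproduced here, so the helper name count_matching_lines, the sample path, and the .log suffix are all made up for the example):

```perl
use strict;
use warnings;
use File::Basename qw(basename);

# Hypothetical helper: count the lines in $path that contain $pattern,
# roughly what a backticked `grep -c` would report.
# \Q...\E treats the pattern as a literal string (an assumption; drop it
# if your patterns are meant to be regular expressions).
sub count_matching_lines {
    my ($pattern, $path) = @_;
    open my $fh, '<', $path or die "Cannot open '$path': $!";
    # In scalar context, grep returns the number of matching elements.
    my $count = grep { /\Q$pattern\E/ } <$fh>;
    close $fh;
    return $count;
}

# Instead of shelling out to basename:
my $path = '/data/logs/server01.log';    # made-up path
my $name = basename($path);

# Instead of piping through sed, copy the value and substitute on the copy:
(my $label = $name) =~ s/\.log\z//;

print "$name -> $label\n";
```

None of this starts a subprocess, so the cost per call is just ordinary Perl code plus one file read.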

Don’t even think about multithreading until you’ve re-implemented your script in pure Perl and benchmarked the results!

Hope that helps,

Athanasius <°(((>< contra mundum

Re^2: Searching pattern in 400 files and getting count out of each file
by space_monk (Chaplain) on Nov 08, 2012 at 10:31 UTC
    The original posting also reads each file once per pattern (through the grep), i.e. it opens every one of the 400 files 8000 or so times. Look for strategies that read each file just once; another reply in this thread seems to have done this implicitly, without explaining why, and it also misses the opportunity to tidy up the patterns outside the file-read loop.
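
A minimal sketch of the read-each-file-once idea, assuming the patterns are literal strings and that a per-line match count is what is wanted (the helper name count_patterns_in_file and the two sample patterns are hypothetical, not taken from the original script):

```perl
use strict;
use warnings;

# Hypothetical helper: returns a hash ref { pattern => number of matching
# lines } for one file, opening and reading that file only once.
sub count_patterns_in_file {
    my ($path, @patterns) = @_;

    # Prepare the patterns once, outside the file-read loop.
    my %regex_for = map { $_ => qr/\Q$_\E/ } @patterns;
    my %count     = map { $_ => 0 } @patterns;

    open my $fh, '<', $path or die "Cannot open '$path': $!";
    while (my $line = <$fh>) {
        for my $pattern (@patterns) {
            $count{$pattern}++ if $line =~ $regex_for{$pattern};
        }
    }
    close $fh;

    return \%count;
}

# Usage: files are named on the command line; the patterns are made up.
my @patterns = ('ERROR', 'WARN');
for my $file (@ARGV) {
    my $count = count_patterns_in_file($file, @patterns);
    print "$file\t$_\t$count->{$_}\n" for @patterns;
}
```

Every line is still tested against every pattern, but each file is opened once instead of once per pattern. If the patterns are really whole fields or words, a hash lookup on the extracted field would probably be faster still.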
    A Monk aims to give answers to those who have none, and to learn from those who know more.