PerlMonks  

Re: Optimizing string searches

by johndageek (Hermit)
on Sep 05, 2008 at 18:50 UTC ( [id://709368] )


in reply to Optimizing string searches

The description is rather vague, but Perl may do better than the OS grep.

If the files reside on multiple machines, run the search on each machine separately if possible.

If the files reside on a single machine, process them locally.

Do not open files across the network.

Suggest:
create a list of log files
loop: open each log file
    loop: read each record of the current file
        regex string1 (if match, write output)
        regex string2 (if match ...)
        regex string3 (if match ...)
        regex string4 (if match ...)
    next record
next log file

Assumes: the log files will be parsed only for these 4 strings, and there will be no reason to search the same log files again for other strings.
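
The outline above could be sketched in Perl roughly as follows. This is only an illustration: the four search strings (`string1` to `string4`), the sub name `scan_logs`, and the output format are placeholders that do not come from the original question, and the list of log files is assumed to arrive on the command line.

```perl
use strict;
use warnings;

# Placeholder patterns; the real search strings are not given in the thread.
my @patterns = (
    [ string1 => qr/string1/ ],
    [ string2 => qr/string2/ ],
    [ string3 => qr/string3/ ],
    [ string4 => qr/string4/ ],
);

# Read each log file once, line by line, and collect every match.
# A line that matches several patterns is reported once per pattern.
sub scan_logs {
    my @log_files = @_;
    my @hits;
    for my $file (@log_files) {
        my $fh;
        unless (open $fh, '<', $file) {
            warn "cannot open $file: $!\n";
            next;
        }
        while (my $line = <$fh>) {
            for my $pattern (@patterns) {
                my ($name, $re) = @$pattern;
                # $. holds the current line number of the file being read
                push @hits, "$file:$.: [$name] $line" if $line =~ $re;
            }
        }
        close $fh;
    }
    return @hits;
}

# Assumed usage: perl scan.pl /var/log/app1.log /var/log/app2.log
print scan_logs(@ARGV);
```

Each file is opened and read exactly once, which is the point of the outline: the cost of I/O is paid a single time no matter how many strings are tested per record.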

Enjoy!
Dageek

Replies are listed 'Best First'.
Re^2: Optimizing string searches
by moritz (Cardinal) on Sep 05, 2008 at 18:58 UTC
    regex string1 (if match write output)
    regex string2 (if match ...)
    regex string3 (if match ...)
    regex string4 (if match ...)

    It's usually faster to build one regex with four alternations and match that instead of matching four single regexes against a string.
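
As a minimal sketch of the single-regex idea, assuming the same placeholder strings `string1` to `string4` (the real ones are not given in the thread) and a hypothetical helper named `line_matches`:

```perl
use strict;
use warnings;

# One precompiled regex with four alternations instead of four separate regexes.
my $combined = qr/string1|string2|string3|string4/;

# Return 1 if the line contains any of the four strings, 0 otherwise.
sub line_matches {
    my ($line) = @_;
    return $line =~ $combined ? 1 : 0;
}

# Small demonstration on sample lines; only the first one is printed.
for my $line ("a string3 b\n", "no hit here\n") {
    print $line if line_matches($line);
}
```

Note that a single alternation only says that some string matched. If the output has to differ per string, the match can be captured with parentheses and inspected afterwards.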

      Thanks Moritz!

      Enjoy!
      Dageek

Re^2: Optimizing string searches
by cool256 (Initiate) on Sep 05, 2008 at 20:54 UTC
    Thanks for all the suggestions.
    Indeed, my question was a bit vague. Since the search strings may change at any time, hardcoding the regex was not an option.
    Instead, I generate a Perl file at runtime that contains the regex, built on the fly from the search strings.
    This boosted the performance, and I'm fairly happy with the results.
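
For comparison, a regex can also be assembled from changing search strings inside the running program, without writing out a separate Perl file. The sketch below shows that alternative, not the approach described above, and makes no claim about which is faster. The helper name `build_matcher` and the sample strings are made up for illustration, and the strings are assumed to be literal text rather than patterns.

```perl
use strict;
use warnings;

# Build one compiled regex from a list of literal search strings.
sub build_matcher {
    my @strings = @_;
    die "no search strings given\n" unless @strings;
    # quotemeta escapes regex metacharacters so each string matches literally.
    my $alternation = join '|', map { quotemeta($_) } @strings;
    return qr/$alternation/;
}

# Example strings; in practice they might come from a config file or @ARGV.
my $re = build_matcher('ERROR', 'disk full', 'a.b');
print "match\n"    if     "xx a.b yy" =~ $re;
print "no match\n" unless "xx aXb yy" =~ $re;
```

Because `qr//` compiles the pattern once, the resulting object can be reused for every line of every log file, and rebuilt whenever the list of search strings changes.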

    Thanks again :)
