PerlMonks  

Re^3: Help with speeding up regex

by davido (Cardinal)
on Aug 14, 2012 at 09:08 UTC


in reply to Re^2: Help with speeding up regex
in thread Help with speeding up regex

ActiveState has Devel::NYTProf in its ppm4 repository: http://code.activestate.com/ppm/Devel-NYTProf/. Once it is installed, cd into the target script's directory and run your script under the profiler: perl -d:NYTProf some_perl.pl input_file.txt. When it completes, review the results by running nytprofhtml --open from that same directory. You should get a browser window with more useful information than you can shake a stick at.

My optimized regex will help, but only as an optimization of the exact regex you provided. It's tricky to implement, and tricky to maintain as your needs continue to evolve. A better solution would be to use threads, or to fork processes. BrowserUk has already made some suggestions on how you might implement such a strategy. The beauty of that sort of approach is that you don't have to concern yourself quite as much with how efficient the regular expressions themselves are, because you're processing several files in parallel.
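To make the forking idea concrete, here is a minimal sketch using only core Perl. It is not BrowserUk's code and not your script: the sub names count_matching_lines and process_files_in_parallel, the "count the matching lines" job, and the worker limit are all made up for illustration. It also assumes each child reports only a short result, small enough to sit in the pipe until the parent reads it. Keep in mind that on Windows, Perl emulates fork with threads, so test it there before relying on it.

```perl
use strict;
use warnings;

# Hypothetical per-file job: count the lines of $file that match $regex.
sub count_matching_lines {
    my ($file, $regex) = @_;
    open my $in, '<', $file or die "Cannot open $file: $!";
    my $count = 0;
    while (my $line = <$in>) {
        $count++ if $line =~ $regex;
    }
    close $in;
    return $count;
}

# Fork one child per file, at most $max_workers at a time. Each child
# sends its count back through a pipe. Returns a hash ref of file => count.
sub process_files_in_parallel {
    my ($files, $regex, $max_workers) = @_;
    my (%reader_for, %file_for, %count_for);
    my @queue = @$files;

    while (@queue or %reader_for) {
        # Start children while there is work left and a free slot.
        while (@queue and keys(%reader_for) < $max_workers) {
            my $file = shift @queue;
            pipe(my $reader, my $writer) or die "pipe failed: $!";
            my $pid = fork();
            die "fork failed: $!" unless defined $pid;
            if ($pid == 0) {    # child: do the work, report, exit
                close $reader;
                print {$writer} count_matching_lines($file, $regex), "\n";
                close $writer;
                exit 0;
            }
            close $writer;      # parent keeps only the read end
            $reader_for{$pid} = $reader;
            $file_for{$pid}   = $file;
        }

        # Wait for any child to finish, then collect its one-line result.
        my $pid = wait();
        last if $pid == -1;
        next unless exists $reader_for{$pid};
        my $reader = delete $reader_for{$pid};
        my $result = <$reader>;
        close $reader;
        chomp $result if defined $result;
        $count_for{ delete $file_for{$pid} } = $result;
    }
    return \%count_for;
}
```

You would call it along the lines of my $counts = process_files_in_parallel(\@files, qr/your pattern/, 4); and then read $counts->{$file} for each file. A file whose child died before reporting ends up with an undefined count, so check for that.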

If you end up with a ton of data every day that has to get chewed through before tomorrow, you might look into a MapReduce strategy, such as with Hadoop.


Dave
