
Re: How do I optimize a regular expression?

by moritz (Cardinal)
on Dec 07, 2009 at 16:29 UTC


in reply to How do I optimize a regular expression?

There are a few things you can try:

(?:thing){0} seems to be a poor man's comment. I'd use an ordinary #...\n comment instead (which requires the /x modifier), or (?#...) comments.
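To make the two comment styles concrete, here is a minimal sketch. The "key = value" pattern and the sample string are made up for illustration and are not taken from the original question:

```perl
use strict;
use warnings;

# Hypothetical pattern for a "key = value" line. With /x, whitespace in
# the pattern is ignored and "#" starts a comment running to end of line.
my $pair = qr{
    \A (\w+)       # key: one or more word characters
    \s* = \s*      # equals sign, optionally padded with whitespace
    (?# inline comment: the value runs up to the first space)
    (\S+) \z       # value: one or more non-space characters
}x;

if ('colour = blue' =~ $pair) {
    print "key=$1 value=$2\n";
}
```

Note that a (?#...) comment ends at the first closing parenthesis, so it cannot contain one itself.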

Second thing: split on whitespace as far as possible, and then check the individual fields with anchored regexes.
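A rough sketch of that approach, assuming a made-up three-field record (date, level, numeric code); the real field layout will of course differ:

```perl
use strict;
use warnings;

# Hypothetical log-style record: date, level, numeric code.
my $line = '2009-12-07 WARN 42';

# Split on runs of whitespace first; the special pattern ' ' also
# discards leading whitespace.
my ($date, $level, $code) = split ' ', $line;

# Each field is then checked with a small regex anchored at both ends,
# so a bad field fails at once instead of being retried at every offset.
my $ok = defined $code
      && $date  =~ /\A\d{4}-\d{2}-\d{2}\z/
      && $level =~ /\A(?:INFO|WARN|ERROR)\z/
      && $code  =~ /\A\d+\z/;

print $ok ? "valid\n" : "invalid\n";
```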

You can also use (?>...) non-backtracking groups for things that don't have to backtrack. Once such a group has matched, the engine never goes back into it to try a shorter match, which makes regexes fail faster if they can't match.
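A small sketch of what "non-backtracking" means in practice. The pattern is deliberately artificial: it only succeeds if the engine backtracks, so it shows the difference in behaviour rather than a speed-up. In real use, (?>...) is only safe where going back into the group could never help the match.

```perl
use strict;
use warnings;

my $text = 'aaaa';

# Ordinary group: a+ first takes all four a's, then gives one back
# so that the final literal "a" can match.
my $plain  = $text =~ /\A(?:a+)a\z/ ? 1 : 0;

# Atomic group: once (?>a+) has taken all four a's it never gives any
# back, so the trailing "a" cannot match and the regex fails.
my $atomic = $text =~ /\A(?>a+)a\z/ ? 1 : 0;

print "plain=$plain atomic=$atomic\n";
```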

If you have the choice, use perl 5.10 or newer; it has an awesome optimization for alternations of literal strings.


Re^2: How do I optimize a regular expression?
by kyle (Abbot) on Dec 07, 2009 at 16:50 UTC

    You're correct about my poor man's comments. I didn't know about (?# ... ) comments or I'd have used that. Thanks!

    To split on white space might be a good idea for this application, and I may try that, but I'm more interested in how to make expressions work faster.

    I don't really understand how backtracking works, so I don't know when (?> ... ) can be used. I probably ought to spend a day in perlre or something.

    I'm actually doing my development with Perl 5.10, but the machine it ultimately has to run on has only 5.8.8. Now I'm really wondering how hard it would be to upgrade it.

    Thanks for your help!

      but I'm more interested in how to make expressions work faster

      Then I'll try to give you a few general hints:

      • Learn about backtracking, and make sure you avoid it wherever possible
        • Anchor your regexes if possible
        • Try to avoid .*? and .*
        • Use backtracking-suppressing groups whenever possible
      • Try to use literal strings where possible. The regex engine is smart enough to anchor them automatically as an optimization (in certain cases)
      • Only capture (with (...)) when you actually need it
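      A minimal sketch of a few of these hints together, using a made-up input string (the attribute names are hypothetical):

```perl
use strict;
use warnings;

my $line = 'name="moritz" rank="Cardinal"';   # hypothetical input

# Instead of  /name="(.*?)"/ : a negated character class cannot run
# past the closing quote, so there is nothing to backtrack into.
# The \A anchor means the match is only attempted at the string start.
my ($name) = $line =~ /\Aname="([^"]*)"/;

# (?:...) groups without capturing; only the quoted value is captured.
my ($rank) = $line =~ /(?:rank|title)="([^"]*)"/;

print "$name $rank\n";
```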

      Regexp::Assemble promises (among other things) to bring the power of trie optimizations to earlier perls; maybe it's worth a try (and less hassle than updating your perl version).
