
Re: How do I optimize a regular expression?

by moritz (Cardinal)
on Dec 07, 2009 at 16:29 UTC (#811538)

in reply to How do I optimize a regular expression?

There are a few things you can try:

(?:thing){0} seems to be a poor man's comment. I'd use an ordinary #...\n comment instead, or (?#...) comments.
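
To illustrate the two comment styles, here is a minimal sketch. The date pattern and the variable names are made up for the example and are not taken from the original regex.

```perl
use strict;
use warnings;

# Hypothetical pattern: an ISO-style date, documented with /x comments.
my $date_re = qr{
    \A
    (\d{4})   # year
    -
    (\d{2})   # month
    -
    (\d{2})   # day
    \z
}x;

# Without /x, (?#...) can carry the same notes inline.
my $same_re = qr/\A(\d{4})(?#year)-(\d{2})(?#month)-(\d{2})(?#day)\z/;

my @parts_x = '2009-12-07' =~ $date_re;
my @parts   = '2009-12-07' =~ $same_re;

print "@parts_x\n";
print "@parts\n";
```

Both forms compile to the same match; unlike (?:thing){0}, neither adds anything for the regex engine to consider.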

Second thing: split on whitespace as far as possible, and then check the individual fields with anchored regexes.
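
A rough sketch of the split-then-validate idea follows. The record layout (IP, method, path, status) and the parse_line helper are assumptions for illustration only, since the original data format isn't shown here.

```perl
use strict;
use warnings;

# Hypothetical record format: "IP METHOD PATH STATUS", whitespace separated.
sub parse_line {
    my ($line) = @_;

    # split ' ' skips leading whitespace and splits on runs of whitespace.
    my @fields = split ' ', $line;
    return undef unless @fields == 4;
    my ($ip, $method, $path, $status) = @fields;

    # Each check is small and anchored at both ends, so it fails quickly.
    return undef unless $ip     =~ /\A\d{1,3}(?:\.\d{1,3}){3}\z/;
    return undef unless $method =~ /\A(?:GET|POST|HEAD)\z/;
    return undef unless $path   =~ m{\A/};
    return undef unless $status =~ /\A\d{3}\z/;

    return { ip => $ip, method => $method, path => $path, status => $status };
}

my $rec = parse_line('192.168.0.1 GET /index.html 200');
print "$rec->{method} $rec->{path}\n" if $rec;
```

Each field regex only ever sees a short string, so there is little room for expensive backtracking across the whole line.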

You can also use (?>...) non-backtracking groups for things that don't have to backtrack. That will make regexes fail faster if they can't match.
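
A small sketch of what "non-backtracking" means in practice; the strings and patterns are toy examples chosen to make the difference visible.

```perl
use strict;
use warnings;

# Plain quantifier: a+ first grabs "aaa", then gives one "a" back
# so that the trailing "ab" can match.
my $plain = 'aaab' =~ /\Aa+ab/ ? 1 : 0;

# Atomic group: (?>a+) grabs "aaa" and never gives anything back,
# so the trailing "ab" cannot match and the regex fails right away.
my $atomic = 'aaab' =~ /\A(?>a+)ab/ ? 1 : 0;

# Where what follows can never overlap with the group, the atomic
# version matches the same strings and just skips useless retries.
my $safe = '123 abc' =~ /\A(?>\d+)\s+(?>\w+)\z/ ? 1 : 0;

print "plain=$plain atomic=$atomic safe=$safe\n";
```

The second case shows the catch: (?>...) is only safe where giving characters back could never lead to a match anyway.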

If you have the choice, use perl 5.10 or newer; it has an awesome optimization for alternatives of literals.

Re^2: How do I optimize a regular expression?
by kyle (Abbot) on Dec 07, 2009 at 16:50 UTC

    You're correct about my poor man's comments. I didn't know about (?# ... ) comments or I'd have used that. Thanks!

    To split on white space might be a good idea for this application, and I may try that, but I'm more interested in how to make expressions work faster.

    I don't really understand how backtracking works, so I don't know when (?> ... ) can be used. I probably ought to spend a day in perlre or something.

    I'm actually doing my development with Perl 5.10, but the machine it ultimately has to run on has only 5.8.8. Now I'm really wondering how hard it would be to upgrade it.

    Thanks for your help!

      but I'm more interested in how to make expressions work faster

      Then I'll try to give you a few general hints:

      • Learn about backtracking, and make sure you avoid it wherever possible
        • Anchor your regexes if possible
        • Try to avoid .*? and .*
        • Use backtracking-suppressing groups whenever possible
      • Try to use literal strings where possible. The regex engine is smart enough to anchor them automatically as an optimization (in certain cases)
      • Only capture (with (...)) when you actually need it
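
A few of the hints above can be sketched as follows. The sample line and the variable names are invented for the example.

```perl
use strict;
use warnings;

# Hypothetical input line.
my $line = 'user="kyle" level="abbot"';

# Anchored with \A, and a negated character class instead of .*?:
# [^"]* can only stop at the next quote, so there is nothing to retry.
my ($user) = $line =~ /\Auser="([^"]*)"/;

# (?:...) groups the alternation without the bookkeeping of a capture.
my $known_level =
    $line =~ /\Auser="[^"]*"\s+level="(?:abbot|cardinal)"\z/ ? 1 : 0;

print "user=$user known_level=$known_level\n";
```

Only the one value that is actually needed gets captured; everything else is matched and thrown away.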

      Regexp::Assemble promises (among other things) to bring the power of trie optimizations to earlier perls. Maybe it's worth a try (and less hassle than updating your perl version).
