
Re: Suggestion for regular expression speed improvement.

by moritz (Cardinal)
on Jun 15, 2009 at 12:11 UTC (#771629)

in reply to Suggestion for regular expression speed improvement.

Corion is right with his suggestions. If you're still interested in how to speed up the regex, here goes:

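The first .+ initially matches every character on the line, then gives characters back one at a time until the \t can match a tab. That is the first tab it reaches while backing up, i.e. the last one on the line. The second .+ then has too few characters left for the rest of the pattern to match, so the first .+ has to give up characters again, and so on.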

To avoid all that backtracking, you should replace each .+ with something that matches everything except tabs: [^\t]+.
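
A minimal sketch of the difference, assuming a hypothetical three-field, tab-separated line (the original pattern is not shown in this thread, so both the sample line and the patterns are illustrative). It is written in Python only to keep it self-contained; the two patterns themselves are valid Perl regexes as written.

```python
import re

# Hypothetical three-field, tab-separated log line (not from the original post).
line = "2009-06-15\tERROR\tdisk full"

# Greedy version: each .+ can run across tabs, so the engine has to backtrack.
greedy = re.compile(r"^(.+)\t(.+)\t(.+)$")

# Negated class: [^\t]+ stops at the next tab, so there is nothing to give back.
no_tabs = re.compile(r"^([^\t]+)\t([^\t]+)\t([^\t]+)$")

greedy_fields = greedy.match(line).groups()
no_tabs_fields = no_tabs.match(line).groups()

# On a well-formed line both patterns capture the same three fields.
print(greedy_fields == no_tabs_fields)

# They differ on a line with an extra tab: .+ absorbs the tab into a field,
# while [^\t]+ cannot cross it, so the anchored pattern does not match at all.
extra = line + "\textra"
greedy_extra = greedy.match(extra)
no_tabs_extra = no_tabs.match(extra)
```

Note that the two patterns are not strictly equivalent: the negated class is faster partly because it is stricter about what a field may contain, which is usually what you want for tab-delimited data.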

Re^2: Suggestion for regular expression speed improvement.
by bala.linux (Novice) on Jun 15, 2009 at 12:35 UTC
    This sounds good. I will adopt this change and compare the performance. Thanks.
      No, don't. Go with the tips Corion gave you above; it's much more sensible to use split or a module. My explanation was mostly meant to satisfy academic curiosity, not as a suggestion on how to solve your problem.
        As I mentioned above, I would not be able to take that approach, since I want to support matching log lines of any format with grouping. If I went with CSV, I would not be able to parse other log formats such as syslog and other proprietary logs.
