Re: Suggestion for regular expression speed improvement.

in reply to Suggestion for regular expression speed improvement.

Corion is right with his suggestions. If you're still interested in how to speed up the regex, here it goes:

The first .+ initially matches every character on the line, then gives up characters one at a time until the \t can match a tab. If the rest of the pattern then fails (for example because the second .+ has no characters left to match), the first .+ has to give up characters again, and so on.

To avoid all that backtracking, you should replace each .+ with something that matches everything except tabs: [^\t]+.
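To make that concrete, here is a minimal sketch. The sample line and the three-field pattern are assumptions made up for illustration, since the actual regex is in the parent post; only the .+ versus [^\t]+ swap is the point.

```perl
use strict;
use warnings;

# Hypothetical three-field, tab-separated log line (not from the original post).
my $line = "2009-06-15 12:35\tsshd\tAccepted password for bala";

# Greedy version: each .+ can run past tabs and then has to backtrack.
my $greedy = qr/^(.+)\t(.+)\t(.+)$/;

# Negated character class: each field stops at the next tab on its own.
my $no_tabs = qr/^([^\t]+)\t([^\t]+)\t([^\t]+)$/;

# In list context a successful match returns the captured groups.
my @greedy_fields  = $line =~ $greedy;
my @no_tabs_fields = $line =~ $no_tabs;

print join(' | ', @no_tabs_fields), "\n";
```

On a line with exactly two tabs both patterns capture the same fields; the second one just gets there without the repeated give-up-and-retry steps.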


Re^2: Suggestion for regular expression speed improvement. by bala.linux (Novice) on Jun 15, 2009 at 12:35 UTC
This sounds good. I will adopt this change and compare the performance. Thanks.
Re^3: Suggestion for regular expression speed improvement. by moritz (Cardinal) on Jun 15, 2009 at 12:37 UTC
No, don't. Go with the tips Corion gave you above; it's much more sensible to use split or a module. My explanation was mostly to satisfy academic curiosity, and was not meant as a suggestion on how to solve your problem.
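For reference, a minimal sketch of the split approach, assuming the fields are separated by single tabs. The sample line is hypothetical and not taken from the thread.

```perl
use strict;
use warnings;

# Hypothetical tab-separated log line, for illustration only.
my $line = "2009-06-15 12:35\tsshd\tAccepted password for bala";

# Split on a literal tab; the -1 limit keeps trailing empty fields.
my @fields = split /\t/, $line, -1;

print scalar(@fields), " fields: ", join(' | ', @fields), "\n";
```

No backtracking is involved here, because split only has to look for the separator.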
Re^4: Suggestion for regular expression speed improvement. by bala.linux (Novice) on Jun 15, 2009 at 13:03 UTC
As I mentioned above, I would not be able to take that approach, since I want to support matching log lines of any format with grouping. If I go with CSV, I would not be able to parse other log formats such as syslog and other proprietary logs.
Re^5: Suggestion for regular expression speed improvement. by demerphq (Chancellor) on Jun 15, 2009 at 14:25 UTC
Re^6: Suggestion for regular expression speed improvement. by moritz (Cardinal) on Jun 15, 2009 at 14:34 UTC
Re^6: Suggestion for regular expression speed improvement. by bala.linux (Novice) on Jun 15, 2009 at 14:46 UTC
