Re: Perl regexp matching is slow??

First off, algorithms can always be improved. There is usually a compromise between raw processing speed and other considerations, such as RAM usage and code maintainability. Perl tends to trade RAM for speed, but treats speed and maintainability as roughly equal priorities (knocks on the codebase aside).

Second, you must understand that Perl set the bar for regexes when P5 was released over 10 years ago. There's a reason why the primary C library for regexes is called "PCRE", short for "Perl Compatible Regular Expressions". In that time, a lot of theoretical work has been done. Not all of that work has been put into the current engine, for reasons such as:

The change doesn't provide enough benefit to justify the risk of making it
The change breaks a feature that has to work
The change hasn't been proven to do what it's claimed to do

Remember - every script written for Perl waaaay back to 1.0.0 is still executable in 5.8.8 - backwards compatibility is a major concern for p5p. Also, the people who have both an understanding of the Perl engine and the time needed to work on it are few and far between. And, honestly, many of them are hard at work fixing bugs, adding other features, and working on Perl 6.

And, lastly - fast is as fast does. Perl is "fast enough" for me and my clients, and that includes the regex engine. I'm not a bare-metal speed freak.

My criteria for good software:

Does it work?
Can someone else come in, make a change, and be reasonably certain no bugs were introduced?

Re^2: Perl regexp matching is slow?? by Aristotle (Chancellor) on Feb 07, 2007 at 09:34 UTC
"Second, you must understand that Perl set the bar for regexes when P5 was released over 10 years ago. […] In that time, a lot of theoretical work has been done."

You missed the part where the author writes he will be reviewing “a regular expression search algorithm invented by Ken Thompson in the mid-1960s.”

Makeshifts last the longest.
Re^3: Perl regexp matching is slow?? by dragonchild (Archbishop) on Feb 07, 2007 at 14:46 UTC
No, I didn't miss that. New code is always being written to implement old algorithms. Just because an algorithm exists doesn't mean that there is an efficient implementation of it. Theoretical work has advanced in the implementation of these algorithms and in how the various algorithms can be used to solve the problems Perl's regexes need to solve.
Re^4: Perl regexp matching is slow?? by Anonymous Monk on Mar 09, 2009 at 18:18 UTC
I guess you missed the part where it was used to write UNIX's lex(1) and grep(1), too.
Re^2: Perl regexp matching is slow?? by Anonymous Monk on Sep 30, 2009 at 18:36 UTC
You're missing the statistics there. The PCRE approach takes twenty seconds for a?³⁰a³⁰ (thirty optional a's followed by thirty required a's, matched against a string of thirty a's), a reasonably simple regex, and keeps slowing exponentially as the pattern gets longer. That's not ‘fast enough’ by any standard.
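
For anyone who wants to see the difference rather than take it on faith, here is a rough sketch of the two approaches on the a?ⁿaⁿ pattern. It is in Python rather than Perl, purely for illustration, and it is not how Perl or PCRE are actually implemented. The token-list representation and the names `pathological`, `nfa_match` and `backtrack_match` are made up for this example, and the "regex" language is limited to single characters that are either required or optional.

```python
def pathological(n):
    """Tokens for a?^n a^n: n optional 'a's, then n required 'a's.

    Each token is (char, is_optional). Hypothetical representation,
    chosen only to keep the sketch short.
    """
    return [("a", True)] * n + [("a", False)] * n


def closure(tokens, states):
    """Extend a set of positions by skipping over optional tokens."""
    result = set()
    for s in states:
        result.add(s)
        while s < len(tokens) and tokens[s][1]:
            s += 1
            result.add(s)
    return result


def nfa_match(tokens, text):
    """Thompson-style simulation: track every live position at once."""
    states = closure(tokens, {0})
    for ch in text:
        # Advance each live position whose token matches this character.
        advanced = {s + 1 for s in states
                    if s < len(tokens) and tokens[s][0] == ch}
        states = closure(tokens, advanced)
        if not states:
            return False
    return len(tokens) in states


def backtrack_match(tokens, text, counter):
    """Greedy backtracking matcher; counter[0] counts recursive calls."""
    def go(i, j):
        counter[0] += 1
        if i == len(tokens):
            return j == len(text)
        ch, optional = tokens[i]
        # Try to consume a character first (greedy), then fall back
        # to skipping the token if it is optional.
        if j < len(text) and text[j] == ch and go(i + 1, j + 1):
            return True
        return optional and go(i + 1, j)
    return go(0, 0)


if __name__ == "__main__":
    # Keep n small for the backtracker; its call count grows quickly.
    for n in (4, 8, 12):
        calls = [0]
        ok = backtrack_match(pathological(n), "a" * n, calls)
        print(f"backtracking n={n}: matched={ok}, calls={calls[0]}")
    # The state-set simulation handles n=30 without trouble.
    print("nfa n=30:", nfa_match(pathological(30), "a" * 30))
```

The state-set version does at most one pass over the token list per input character, so its work grows with pattern length times text length. The backtracker has to try every combination of "take it or skip it" for the optional a's before it finds the one that works. The catch, and part of why this isn't a drop-in fix, is that this kind of simulation doesn't cover features such as backreferences.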

