First off, algorithms can always be improved. There is usually a compromise between raw processing speed and other considerations, such as RAM usage and code maintainability. Perl generally sacrifices RAM for speed, but treats speed and maintainability as about equally important (knocks on the codebase aside).
Second, you must understand that Perl set the bar for regexes when P5 was released over 10 years ago. There's a reason why the primary C library for regexes is called "PCRE", or "Perl Compatible Regular Expressions". Since then, a lot of theoretical work has been done. Not all of that work has been put into the current engine, for many reasons, such as:
- The change doesn't provide enough benefit for the risk of making the change
- The change breaks a feature that has to work
- The change hasn't been proven to do what it's claimed to do
Remember - every script written for Perl waaaay back to 1.0.0 is still executable in 5.8.8 - backwards compatibility is a major concern for p5p. Also, the people who have both an understanding of the Perl engine and the necessary time to work on it are few and far between. And, honestly, many of them are hard at work fixing bugs, adding other features, and working on Perl 6.
And, lastly - fast is as fast does. Perl is "fast enough" for me and my clients, and that includes the regex engine. I'm not a bare-metal speed freak.
My criteria for good software:
- Does it work?
- Can someone else come in, make a change, and be reasonably certain no bugs were introduced?