PerlMonks  

Re: Perl regexp matching is slow??

by dragonchild (Archbishop)
on Jan 30, 2007 at 04:17 UTC ( [id://597266]=note )


in reply to Perl regexp matching is slow??

First off, algorithms can always be improved. There is usually a compromise between raw processing speed and other considerations, such as RAM usage and code maintainability. Perl tends to sacrifice RAM for speed, but treats speed and maintainability as roughly equal (knocks on the codebase aside).

Second, you must understand that Perl set the bar for regexes when P5 was released over 10 years ago. There's a reason why the primary C library for regexes is called "pcre", or "Perl Compatible Regular Expressions". In that time, a lot of theoretical work has been done. Not all of that work has been put into the current engine, for reasons such as:

  • The change doesn't provide enough benefit to justify the risk of making it
  • The change breaks a feature that has to work
  • The change hasn't been proven to do what it's claimed to do
Remember - every script written for Perl waaaay back to 1.0.0 is still executable in 5.8.8 - backwards compatibility is a major concern for p5p. Also, the people who have both an understanding of the Perl engine and the time to work on it are few and far between. And, honestly, many of them are hard at work fixing bugs, adding other features, and working on Perl 6.

And, lastly - fast is as fast does. Perl is "fast enough" for me and my clients, and that includes the regex engine. I'm not a baremetal speed freak.


My criteria for good software:
  1. Does it work?
  2. Can someone else come in, make a change, and be reasonably certain no bugs were introduced?

Replies are listed 'Best First'.
Re^2: Perl regexp matching is slow??
by Aristotle (Chancellor) on Feb 07, 2007 at 09:34 UTC

    Second, you must understand that Perl set the bar for regexes when P5 was released over 10 years ago. […] In that time, a lot of theoretical work has been done.

    You missed the part where the author writes he will be reviewing “a regular expression search algorithm invented by Ken Thompson in the mid-1960s.”

    Makeshifts last the longest.

      No, I didn't miss that. New code is always being written to implement old algorithms. Just because an algorithm exists doesn't mean that there is an efficient implementation of it. Theoretical work has advanced in the implementation of these algorithms and in how the various algorithms can be used to solve the problems Perl's regexes need to solve.

        I guess you missed the part where it was used to write UNIX's lex(1) and grep(1), too.
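For readers who have not seen the algorithm this sub-thread is arguing about, here is a minimal Python sketch of the set-of-states idea attributed to Thompson. It is not a regex engine and is not taken from the article or from any Perl source: it only handles a sequence of literal characters, each optionally marked with `?`, and the names `closure`, `matches` and `a_opt_n_a_n` are made up for illustration.

```python
def closure(items, positions):
    """Return positions plus every position reachable by skipping optional items."""
    seen = set()
    stack = list(positions)
    while stack:
        i = stack.pop()
        if i in seen:
            continue
        seen.add(i)
        # items[i] is (char, optional); an optional item may be skipped.
        if i < len(items) and items[i][1]:
            stack.append(i + 1)
    return seen


def matches(items, text):
    """Return True if the whole text matches the item sequence.

    Position i means "items[:i] have been dealt with". All live positions
    are advanced together, one input character at a time, so no
    backtracking is needed.
    """
    states = closure(items, {0})
    for ch in text:
        advanced = {i + 1 for i in states
                    if i < len(items) and items[i][0] == ch}
        states = closure(items, advanced)
        if not states:
            return False
    return len(items) in states


def a_opt_n_a_n(n):
    """Item list for n copies of a? followed by n copies of a (hypothetical helper)."""
    return [("a", True)] * n + [("a", False)] * n


if __name__ == "__main__":
    print(matches(a_opt_n_a_n(30), "a" * 30))
    print(matches(a_opt_n_a_n(30), "a" * 29))
```

Each input character touches at most one state per pattern position, so the work grows with the product of pattern length and text length rather than exponentially. Whether that carries over to everything Perl's regexes must support (backreferences, for instance) is exactly the open question in the posts above.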
Re^2: Perl regexp matching is slow??
by Anonymous Monk on Sep 30, 2009 at 18:36 UTC
    You're missing the statistics there. The P.C.R.E. approach takes twenty seconds for a?³⁰a³⁰ (thirty copies of a? followed by thirty copies of a), a reasonably simple regex, and continues to slow exponentially for longer patterns. That's not ‘fast enough’ by any standard.
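The growth described above can be observed with any backtracking matcher. The sketch below uses Python's `re` module as a stand-in, on the assumption that it backtracks in the same way as the engines under discussion; it is not a measurement of Perl or PCRE. The helper names `pathological` and `time_match` are made up for illustration, and n is kept small so the script finishes quickly.

```python
import re
import time


def pathological(n):
    """Return (pattern, subject): n copies of 'a?' then n copies of 'a', and 'a' * n."""
    return "a?" * n + "a" * n, "a" * n


def time_match(n):
    """Match the pattern against its subject; return (matched, elapsed seconds)."""
    pattern, subject = pathological(n)
    start = time.perf_counter()
    matched = re.match(pattern, subject) is not None
    return matched, time.perf_counter() - start


if __name__ == "__main__":
    # Timings depend on the machine and the Python version; only the
    # trend from one n to the next is of interest.
    for n in (10, 12, 14, 16, 18):
        matched, elapsed = time_match(n)
        print(f"n={n:2d} matched={matched} elapsed={elapsed:.4f}s")
```

The match always succeeds, but only after the engine has tried the combinations in which some of the leading a? items consume a character, which is where the exponential cost comes from.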
