in reply to Regexp do's and don'ts

You should recommend people avoid constructs like [Jj][Aa][Vv][Aa], as they are quite inefficient and can also blow out various optimizations just by their presence. It's better to write that as (?i:Java). Also, up until 5.9.2 perl doesn't optimise alternations very well, so it's advisable to use modules like Regexp::List or the like to preprocess them.

OTOH, as of 5.9.2 perl _does_ optimize them, so using things like Regexp::List will only slow down your patterns (I'm hopeful that by 5.10 these modules will be updated to Do The Right Thing Regardless™).
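
To illustrate the character-class point, here is a minimal sketch that checks the two spellings accept the same strings. The sample words are made up for illustration, and it only compares what matches, not how fast.

```perl
use strict;
use warnings;

# Hand-rolled case-insensitivity: one character class per letter.
my $classes = qr/[Jj][Aa][Vv][Aa]/;
# The same intent, written as a scoped case-insensitive group.
my $scoped  = qr/(?i:Java)/;

# Hypothetical sample inputs, chosen only for illustration.
my @words = qw(Java JAVA jAvA javascript Perl jav);

my (@matched, @disagree);
for my $word (@words) {
    my $by_classes = $word =~ $classes ? 1 : 0;
    my $by_scoped  = $word =~ $scoped  ? 1 : 0;
    push @matched,  $word if $by_scoped;
    push @disagree, $word if $by_classes != $by_scoped;
}

print "matched: @matched\n";
print "disagreements: ", scalar(@disagree), "\n";
```

Both patterns are unanchored, so "javascript" matches as well. Add \b or ^...$ if you only want the whole word.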

In fact, from that version on it is recommended that you use alternations instead of quantifiers and bracketing wherever possible. I.e.,

will be more efficient than
as of 5.9.2, and in some circumstances massively more efficient.
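
If you want to measure this on your own perl, something like the following sketch with the core Benchmark module should do. The two patterns and the sample lines are hypothetical stand-ins chosen for illustration (a plain alternation versus a hand-factored version using brackets), and the numbers will depend on your perl version and your data.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Hypothetical pair of patterns meant to match the same three words.
my $alternation = qr/\b(?:foobar|foobaz|fooqux)\b/;   # plain alternation
my $factored    = qr/\bfoo(?:ba[rz]|qux)\b/;          # hand-factored with brackets

# Made-up input: three lines that should match, three that should not.
my @lines = map { "prefix $_ suffix" } qw(foobar foobaz fooqux foobat food bar);

# Count how many sample lines a given pattern matches.
sub count_matches {
    my ($re) = @_;
    my $count = grep { $_ =~ $re } @lines;
    return $count;
}

# Run each variant for at least one CPU second and print a comparison table.
cmpthese(-1, {
    alternation => sub { count_matches($alternation) },
    factored    => sub { count_matches($factored) },
});
```

Check that both variants match the same lines before trusting any timing difference.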

I admit I wrote the optimization, so I'm tooting my own horn a bit here. :-) But it is worth realizing that alternations in later perls can be significantly faster than other hypothetically equivalent patterns.


Re^2: [Try-out] Regexp do's and don'ts
by muba (Priest) on Mar 28, 2005 at 11:24 UTC
    I am fully aware of the fact that m/[Jj][Aa][Vv][Aa]/ sucks like hell. I just needed a "complex" regex which had a clear goal, in order to demonstrate multi-lining regexes. But I'll add a note: don't try this at home :)

    "2b"||!"2b";$$_="the question"
    Besides that, my code is untested unless stated otherwise.
    One more: please review the article about regular expressions (do's and don'ts) I'm working on.