
Re: alternation in regexes: to use or to avoid?

by Athanasius (Monsignor)
on Dec 10, 2012 at 15:07 UTC


in reply to alternation in regexes: to use or to avoid?

Perhaps the following quote from the Camel Book will shed some light on this question:

Short-circuit alternation is often faster than the corresponding regex. So:

print if /one-hump/ || /two/;

is likely to be faster than:

print if /one-hump|two/;

at least for certain values of one-hump and two. This is because the optimizer likes to hoist certain simple matching operations up into higher parts of the syntax tree and do very fast matching with a Boyer-Moore algorithm. A complicated pattern tends to defeat this.
— Tom Christiansen, brian d foy & Larry Wall with Jon Orwant, Programming Perl (4th Edition, 2012), p. 692.
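
Since the quote itself hedges with "at least for certain values", the only reliable way to know which form wins for your data is to time both. Below is a minimal sketch using the core Benchmark module. The input lines and the sub names (count_short_circuit, count_alternation) are made up for illustration, so treat the output as indicative for this data and your perl only.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Hypothetical test data: mostly non-matching lines, with a hit every 50th line.
my @lines = map { "line $_ of filler text " . ($_ % 50 == 0 ? 'two' : 'none') } 1 .. 1000;

# Two simple patterns joined by the short-circuit || operator.
sub count_short_circuit {
    my $n = 0;
    for (@lines) { $n++ if /one-hump/ || /two/ }
    return $n;
}

# The same test written as one regex with alternation.
sub count_alternation {
    my $n = 0;
    for (@lines) { $n++ if /one-hump|two/ }
    return $n;
}

# Both forms must agree on what matches before timing means anything.
die "mismatch" unless count_short_circuit() == count_alternation();

# Run each sub for at least 1 CPU second and print a comparison table.
cmpthese(-1, {
    short_circuit => \&count_short_circuit,
    alternation   => \&count_alternation,
});
```

Swap in your own patterns and a sample of your real input; the relative ranking can change with the patterns, the data, and the perl version.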

Hope that helps,

Athanasius <°(((>< contra mundum
Iustus alius egestas vitae, eros Piratica,


Re^2: alternation in regexes: to use or to avoid?
by dk (Chaplain) on Dec 10, 2012 at 15:14 UTC
    Not really, because it says:

    A complicated pattern tends to defeat this.

    and I'm seeing exactly the opposite. I wish Tom would comment on that :) But thank you for the quote; it helps me understand why I think the observed behavior is bad.

      Perhaps read "complicated" as "non-trivial", e.g. having alternations.

        Please read the benchmark numbers.

        Alternation is MUCH faster than looping over trivial regexes, except when you use captures inside the alternations.
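
The benchmark numbers referred to above are not reproduced in this thread, so here is a rough sketch of how the three cases could be compared: a loop over trivial regexes, a plain alternation, and an alternation with a capture group around each branch. The word list, the input lines, and the sub names are all hypothetical, and the results will depend on your perl version and data.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Hypothetical keyword list and input; not the data from the thread.
my @words = qw(alpha bravo charlie delta echo);
my @lines = map { "record $_ " . ($_ % 10 == 0 ? 'delta' : 'zulu') } 1 .. 1000;

my @quoted    = map { quotemeta($_) } @words;
my @singles   = map { qr/$_/ } @quoted;              # one trivial regex per word
my $plain_src = join '|', @quoted;
my $plain     = qr/$plain_src/;                      # alpha|bravo|...
my $cap_src   = '(' . join(')|(', @quoted) . ')';
my $capturing = qr/$cap_src/;                        # (alpha)|(bravo)|...

# Try each trivial regex in turn; stop at the first one that matches.
sub count_loop {
    my $n = 0;
    LINE: for my $line (@lines) {
        for my $re (@singles) {
            if ($line =~ $re) { $n++; next LINE }
        }
    }
    return $n;
}

# Match every line against a single precompiled regex.
sub count_with {
    my ($re) = @_;
    my $n = 0;
    for my $line (@lines) { $n++ if $line =~ $re }
    return $n;
}

# All three approaches must count the same lines.
die "mismatch" unless count_loop() == count_with($plain)
                  and count_loop() == count_with($capturing);

cmpthese(-1, {
    loop_of_regexes     => \&count_loop,
    plain_alternation   => sub { count_with($plain) },
    capture_alternation => sub { count_with($capturing) },
});
```

If the captures turn out to be the slow part and you only need to know which word matched, one option is a single group around the whole alternation, (alpha|bravo|...), which you can time the same way.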
