
Re^3: (Ab)using the Regex Engine

by vr (Curate)
on May 26, 2020 at 15:51 UTC (#11117288)

in reply to Re^2: (Ab)using the Regex Engine
in thread (Ab)using the Regex Engine

I'd call it "abuse". My bet is that this pattern of use is well known and tolerated for the sake of the critical mass of existing "cool examples of (ab)using the regex engine", and is therefore safe to use in the future :). A stand-alone (*F) is guaranteed to fail, so there's no need to "force a backtrack" while staying in the same branch; and since there are no other branches in your example, the whole match ought to have been optimized away. On the other hand, something like (?(?{CODE})(*F)), where the result of CODE depends on the sub-matches so far, is a legitimate use and another matter entirely, but that is not the case here.

My impression is that this tolerance goes so far that injecting (*F) makes the engine fail to fail early (though not always), which is funny.

my $match = qr[([ab]+)([ab]+)];
my $str = 'aba';
$str =~ /^ $match $ (?{ print "1: $1-$2\n" }) a /x;
$str =~ /^ $match $ (?{ print "2: $1-$2\n" }) b /x;
$str =~ /^ $match $ (?{ print "3: $1-$2\n" }) (*F) b /x;
$str =~ /^ $match $ (?{ print "4: $1-$2\n" }) (*F) .. /x;
__END__
1: ab-a
1: a-ba
3: ab-a
3: a-ba
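For contrast, here is a minimal sketch of the conditional form (?(?{CODE})(*F)) mentioned above, where CODE looks at the sub-matches so far. The task (rejecting splits whose first part is longer than the second) and the @kept array are made up for illustration and are not from the thread.

```perl
use strict;
use warnings;

# Hypothetical task: split 'aba' into two non-empty parts and reject
# any split whose first part is longer than its second part.
my $str = 'aba';
my @kept;

$str =~ /^ ([ab]+) ([ab]+) $
         (?(?{ length($1) > length($2) }) (*F) )
         (?{ push @kept, "$1-$2" })
        /x;

print "$_\n" for @kept;
```

Here (*F) is reached only when the code condition is true, so the failure really does depend on what was captured: the split ab-a is rejected, the engine backtracks, and a-ba is accepted and ends the match.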

Replies are listed 'Best First'.
Re^4: (Ab)using the Regex Engine
by jo37 (Hermit) on May 26, 2020 at 17:55 UTC

    Probably my statement in Re^2: (Ab)using the Regex Engine about "use" vs. "abuse" was unclear and I should have quoted the relevant section from perlre:

    (*FAIL) (*F) (*FAIL:arg)
    This pattern matches nothing and always fails. It can be used to force the engine to backtrack. It is equivalent to (?!), but easier to read. In fact, (?!) gets optimised into (*FAIL) internally. You can provide an argument so that if the match fails because of this FAIL directive the argument can be obtained from $REGERROR. It is probably useful only when combined with (?{}) or (??{}).
    My point was that I realized that
    (?{CODE})(?!)
    (?{CODE})(*F)
    are documented as equivalently forcing the engine to backtrack and are just what I was looking for. I don't call this "abuse", but YMMV.

    The example with a character class was just historical and the one with (??{CODE}) was a result of my own ignorance.
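The (?{CODE})(*F) idiom discussed in this reply can be sketched as follows: the code block records the current captures, then (*F) rejects the attempt so the engine moves on to the next candidate. The input string and the @splits array are assumptions chosen for illustration, and, as noted earlier in the thread, how much the optimizer lets through may depend on the pattern.

```perl
use strict;
use warnings;

my $str = 'abab';
my @splits;

# Record each way of splitting $str into two non-empty parts.
# The pattern as a whole never matches: (*F) always fails, which
# sends the engine back to try the next split.
$str =~ /^ ([ab]+) ([ab]+) $ (?{ push @splits, "$1-$2" }) (*F) /x;

print "$_\n" for @splits;
```

Collecting into an array rather than printing inside the code block keeps the side effect easy to inspect after the match attempt has finished.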


