Here is my string: "A01234567890A". Here is my regex:
m{
  A            # A
  (?:
    [^AB]*     # 0 or more non-A and non-B characters
    . B        # any character, then a B
  )*           # this combo, 0 or more times
  [^AB]*       # 0 or more non-A and non-B characters
  A            # an A
}x
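To make it concrete what this pattern accepts, here is a small sketch that runs it against a few strings. The variable name $re and every test string other than the one quoted above are my own illustrations, not part of the original problem.

```perl
use strict;
use warnings;

# The regex from the post, compiled once. ($re is a hypothetical name.)
my $re = qr{
  A            # A
  (?:
    [^AB]*     # 0 or more non-A and non-B characters
    . B        # any character, then a B
  )*           # this combo, 0 or more times
  [^AB]*       # 0 or more non-A and non-B characters
  A            # an A
}x;

# Illustrative inputs (assumed, apart from the first one).
for my $s ("A01234567890A", "AxBA", "AAByA", "ABA") {
    printf "%-14s %s\n", $s, ($s =~ $re ? "matches" : "does not match");
}
```

Note that "AxBA" only matches because [^AB]* gives back the 'x' so that . B can consume "xB". That single step of backtracking is legitimate; the complaint is about the steps after it.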
Now, before you ask, this is related to a) unrolling the
loop, and b) reversing a regex. But it's also a very isolated
case of the regex engine being a ninny.
This is what I personally think the regex engine should do:
BEFORE            AFTER             REGEX
<>                <A01234567890A>   A
<A>               <01234567890A>    [^AB]*
<A01234567890>    <A>               .
<A01234567890A>   <>                B  FAILED
<A0123456789>     <0A>              .
<A01234567890>    <A>               B  FAILED
At this point, Perl should NOT try to do:
<A012345678>      <90A>             .
<A0123456789>     <0A>              B  FAILED
since Perl should KNOW that '0' was matched by [^AB]*, so it
can't POSSIBLY match B.
Instead, Perl should realize it should give up, and continue:
<A01234567890>    <A>               [^AB]*
<A01234567890>    <A>               A
<A01234567890A>   <>                FINISHED
This is NOT the case. Perl zips ALL the way back to the
first 0 in the string, trying to match . B at every step
until its options are exhausted. Only then does it go back
to the '...890' having been matched by [^AB]*, give up on
the (?:...)* group, and move on to the [^AB]* outside it.
That matches the digits, and then the 'A' matches.
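One way to watch this happen is to drop an embedded code block, (?{ ... }), between the . and the B, so that every time the engine gets as far as attempting B it records where . matched. This is only a sketch: the names $str, $matched and @dot_offsets are mine, and the exact number of attempts may depend on the perl version and its optimizer.

```perl
use strict;
use warnings;

my $str = "A01234567890A";
my @dot_offsets;   # offset of the character '.' matched, per attempt at B

my $matched = $str =~ m{
  A
  (?:
    [^AB]*
    .  (?{ push @dot_offsets, pos() - 1 })   # runs each time '.' matches
    B
  )*
  [^AB]*
  A
}x;

print "matched: ", ($matched ? "yes" : "no"), "\n";
print "B attempted after '.' matched at offsets: @dot_offsets\n";
print "attempts: ", scalar(@dot_offsets), "\n";
```

If the engine behaved the way I would like, only the first two attempts (with . on the final 'A' and on the last '0') would show up. Every further entry is a position where B is tried against a character that [^AB]* had already accepted.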
My gripe is that Perl should know that if something COULD be
matched by [^AB]*, then it CAN'T match 'B'.