PerlMonks  

Re: Re: Re: Common Regex Gotchas

by chromatic (Archbishop)
on Nov 27, 2001 at 04:47 UTC (#127677)


in reply to Re: Re: Common Regex Gotchas
in thread Common Regex Gotchas

Why doesn't the engine continue backwards past the whitespace and look for a <\/tag> string?

Because the engine takes the match that starts at the leftmost possible position, and .* is greedy: it grabs as much as it can. When it hits .*, it jumps all the way to the end of the string and then backtracks one character at a time, trying to match the next required part of the pattern. Backing up from the end, the first </tag> it finds is the last one in the string. That fits the pattern, so it stops there and doesn't continue backtracking to find a shorter match.
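Here's a minimal sketch of that behavior. The string and the <tag> pattern are made up for illustration (they are not the example from the parent node), so treat the names as assumptions:

```perl
use strict;
use warnings;

# Hypothetical input: two tagged sections on one line.
my $text = '<tag>one</tag> and <tag>two</tag>';

# Greedy: .* first grabs everything up to the end of the string, then gives
# characters back one at a time until </tag> can match. The first </tag>
# found while backing up is the LAST one in the string.
my ($greedy) = $text =~ m{<tag>(.*)</tag>};

print "$greedy\n";   # prints: one</tag> and <tag>two
```

The capture swallows the first closing tag and the second opening tag, because the engine is satisfied as soon as it finds any </tag> while backing up from the end.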

If, when creating the example string, I add a carriage return after the <\/tag>, there shouldn't be any whitespace to match on, right?

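The /s flag allows the '.' token to match newlines, so a line break is just another character for .* to consume. Adding the minimal token '?' after the quantifier (.*?) avoids the jump-to-end-then-backtrack behavior. It works like you'd expect, trying to match as few characters as possible and stopping as soon as the rest of the pattern matches.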
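A quick sketch of how /s and the minimal '?' interact, again using a made-up string and pattern rather than the one from the parent node:

```perl
use strict;
use warnings;

# Hypothetical input with a newline inside the first tagged section.
my $text = "<tag>one\ntwo</tag> and <tag>three</tag>";

# Without /s, '.' will not cross the newline, so the first <tag> can never
# reach a </tag>. The engine moves on and matches the second section.
my ($no_s) = $text =~ m{<tag>(.*)</tag>};

# With /s, '.' matches newlines too, and greedy .* runs to the last </tag>.
my ($greedy_s) = $text =~ m{<tag>(.*)</tag>}s;

# With /s and the minimal quantifier, .*? stops at the first </tag>.
my ($minimal_s) = $text =~ m{<tag>(.*?)</tag>}s;

print "no /s:      $no_s\n";        # three
print "greedy /s:  $greedy_s\n";    # one, newline, then: two</tag> and <tag>three
print "minimal /s: $minimal_s\n";   # one, newline, then: two
```

Note that /s and '?' are independent: /s changes what '.' can match, while '?' changes how much the quantifier tries to take.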

Does that clear it up? I've also touched up the formatting somewhat.
