
Re: Using Look-ahead and Look-behind

by jds17 (Pilgrim)
on May 07, 2009 at 16:13 UTC (#762651)

in reply to Using Look-ahead and Look-behind

Thank you for your very nice article, I certainly learned some new tricks!

Just one little comment: the code in the last paragraph did not work for me, because regular expressions are greedy by default. (Did this change between Perl versions?) The only right-substring that comes out is the full string:

$_ = "Hello"; print "$1\n" while /(?=(.*))/g;

Making the "(.*)" part non-greedy fixes it (in Perl 5.10):

$_ = "Hello"; print "$1\n" while /(?=(.*)?)/g;

Output: Hello ello llo lo o

Regex bug in 5.10 (was: Using Look-ahead and Look-behind)
by Roy Johnson (Monsignor) on May 08, 2009 at 14:39 UTC
    I think you have found a bug in 5.10's regex handling. The lookahead's greediness or non-greediness should not matter, because it does not consume any characters. When used in a global match, patterns that do not consume characters should advance one character on each match. At least that's how I read the documentation.
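To make the advancing behaviour concrete, here is a minimal sketch that records pos() after each zero-width match. It uses the (.+) variant discussed in this thread; the names $text and @trace are only illustrative, and the behaviour described in the comments assumes a Perl that is not affected by the suspected 5.10 problem.

```perl
use strict;
use warnings;

my $text = "Hello";
my @trace;    # one [position, capture] pair per successful match

# The lookahead consumes nothing, so every match is zero-length.
# The engine itself has to step forward before the next attempt,
# otherwise the loop would match at the same spot forever.
while ($text =~ /(?=(.+))/g) {
    push @trace, [pos($text), $1];
}

printf "pos=%d  \$1=%s\n", @$_ for @trace;
```

Each line of output pairs the position where the zero-length match succeeded with the suffix that the lookahead captured there.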

    The really interesting thing about your version is that you didn't make the capture non-greedy, you made it optional. You probably meant (.*?), which (in pre-5.10) will output empty strings every time. I haven't installed 5.10 myself, so I can't play with it right now.
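The difference between an optional capture and a non-greedy one is easier to see outside the lookahead. The following is a small sketch with anchored patterns rather than the global match from the thread, so it should not depend on the zero-width advancing behaviour in question; the variable names are made up for illustration.

```perl
use strict;
use warnings;

my $text = "Hello";

# (.*)? : a greedy capture that is optional as a whole.
# It still prefers to match, and when it does it takes everything.
my ($optional) = $text =~ /^(.*)?/;

# (.*?) : a non-greedy capture. With nothing after it in the
# pattern, it is satisfied with the shortest match, the empty string.
my ($lazy) = $text =~ /^(.*?)/;

print "optional: '$optional'\n";
print "lazy:     '$lazy'\n";
```

So the trailing "?" outside the parentheses changes whether the group has to participate at all, not how much it grabs when it does.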

    Caution: Contents may have been coded under pressure.
      ...which (in pre-5.10) will output empty strings every time.

      With 5.10.0, /(?=(.*?))/g; outputs one empty string.  And I can confirm the behavior reported by jds17 with /(?=(.*))/g.

      You are right, my change did not affect greediness. The bad thing is: now I don't understand why my proposed solution worked at all. Maybe someone can explain? I don't think the question is too important, but I like to use regular expressions and it bugs me a little if I cannot understand one (especially such a tiny one).

      I read the documentation you cited and it helped, so I played around some more. The following only replaces "*" with "+" in your original expression. It works as one would expect and would therefore be my preferred solution, at least for Perl 5.10:

      $_ = "Hello"; print "$1\n" while /(?=(.+))/g;
      Hello ello llo lo o
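If the suffixes are needed as a list rather than printed one by one, the same pattern can be used in list context. The sketch below also builds the list with substr as a regex-free cross-check, which may be handy when comparing Perl versions; @by_regex and @by_substr are hypothetical names, not something from the original article.

```perl
use strict;
use warnings;

my $text = "Hello";

# In list context, a //g match returns all captures at once.
my @by_regex = $text =~ /(?=(.+))/g;

# Cross-check without the regex engine: take every suffix directly.
my @by_substr = map { substr($text, $_) } 0 .. length($text) - 1;

print "regex:  @by_regex\n";
print "substr: @by_substr\n";
```

If the two lines differ on a given Perl, that points at the regex engine rather than at the pattern.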
