Re: Using Look-ahead and Look-behind

by jds17 (Pilgrim)
on May 07, 2009 at 16:13 UTC


in reply to Using Look-ahead and Look-behind

Thank you for your very nice article, I certainly learned some new tricks!

Just one small comment: the code in the last paragraph did not work for me because, by default, regular expressions are greedy. (Did this change between Perl versions?) The only right-substring that comes out is the full string:

$_ = "Hello"; print "$1\n" while /(?=(.*))/g;
Output:
Hello
Making the "(.*)" part non-greedy fixes it (in Perl 5.10):
$_ = "Hello"; print "$1\n" while /(?=(.*)?)/g;
Output:
Hello
ello
llo
lo
o


Regex bug in 5.10 (was: Using Look-ahead and Look-behind)
by Roy Johnson (Monsignor) on May 08, 2009 at 14:39 UTC
    I think you have found a bug in 5.10's regex handling. The lookahead's greediness or non-greediness should not matter, because it does not consume any characters. When used in a global match, patterns that do not consume characters should advance one character on each match. At least that's how I read the documentation.

    The really interesting thing about your version is that you didn't make the capture non-greedy, you made it optional. You probably meant (.*?), which (in pre-5.10) will output empty strings every time. I haven't installed 5.10 myself, so I can't play with it right now.
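The difference between an optional group and a non-greedy quantifier can be seen without any lookahead. The following is a minimal sketch; the helper name `first_capture` is hypothetical, and the patterns are anchored with `^` so the result does not depend on how `/g` handles zero-length matches in a given Perl version.

```perl
use strict;
use warnings;

# Hypothetical helper: return the first capture group of $re applied
# to $str, or undef if the match fails.
sub first_capture {
    my ($str, $re) = @_;
    return $str =~ $re ? $1 : undef;
}

my $str = "Hello";

# Greedy: .* takes as much as it can.
printf "%-8s => '%s'\n", '^(.*)',  first_capture($str, qr/^(.*)/);   # 'Hello'

# Optional group: the ? makes the whole group optional, but the .*
# inside it is still greedy, so the capture is unchanged.
printf "%-8s => '%s'\n", '^(.*)?', first_capture($str, qr/^(.*)?/);  # 'Hello'

# Non-greedy: .*? takes as little as it can, here nothing at all.
printf "%-8s => '%s'\n", '^(.*?)', first_capture($str, qr/^(.*?)/);  # ''
```

So `(.*)?` and `(.*)` capture the same text here, while `(.*?)` captures the empty string because nothing after it forces it to grow.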


    Caution: Contents may have been coded under pressure.
      ...which (in pre-5.10) will output empty strings every time.

      With 5.10.0, /(?=(.*?))/g; outputs one empty string.  And I can confirm the behavior reported by jds17 with /(?=(.*))/g.

      You are right, my change did not affect greediness. The bad thing is: now I don't understand why my proposed solution worked at all. Maybe someone can explain? I don't think the question is too important, but I like to use regular expressions and it bugs me a little if I cannot understand one (especially such a tiny one).

      I read the documentation you cited and it helped, so I played around some more. The following, which only replaces "*" with "+" in your original expression, works as one would expect and would therefore be my preferred solution, at least for Perl 5.10:

      $_ = "Hello"; print "$1\n" while /(?=(.+))/g;
      Output:
      Hello
      ello
      llo
      lo
      o
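As a sketch of how the `.+` version could be packaged for reuse, the lookahead can be run in list context to collect all suffixes at once, and `pos()` can be recorded to watch the zero-length match advance one character at a time. The sub names `suffixes` and `match_positions` are hypothetical, and `.` does not match a newline, so this assumes a single-line string.

```perl
use strict;
use warnings;

# Hypothetical helper: return every non-empty right-substring of $str.
# In list context, m//g returns the captures from all matches.
sub suffixes {
    my ($str) = @_;
    return $str =~ /(?=(.+))/g;
}

# Hypothetical helper: return the offset at which each zero-length
# match was found. The lookahead consumes nothing, so pos() after a
# match is the offset where that match started.
sub match_positions {
    my ($str) = @_;
    my @positions;
    while ($str =~ /(?=(.+))/g) {
        push @positions, pos($str);
    }
    return @positions;
}

print join(" ", suffixes("Hello")), "\n";         # Hello ello llo lo o
print join(" ", match_positions("Hello")), "\n";  # 0 1 2 3 4
```

Because `.+` needs at least one character, the match fails at the end of the string instead of producing a final empty capture.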
