PerlMonks  

Re^2: Regex Greed

by kennethk (Monsignor)
on Aug 07, 2012 at 21:41 UTC (#986088)


in reply to Re: Regex Greed
in thread Regex Greed

To add a little detail, jwkrahn is using a lookahead, so the actual match itself is zero-width. See Looking ahead and looking behind in perlretut.
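
To make "zero-width" concrete, here is a minimal sketch. The helper name zero_width_probe and the sample string are made up for illustration; only the lookahead pattern comes from the thread. It reports what the capture group holds and where the overall match starts and ends:

```perl
use strict;
use warnings;

# Hypothetical helper: apply a lookahead-wrapped capture once and
# report the captured text plus the offsets of the overall match.
sub zero_width_probe {
    my ($text) = @_;
    return unless $text =~ /(?=(\d{3}))/;
    return {
        capture => $1,       # text grabbed inside the lookahead
        start   => $-[0],    # offset where the overall match starts
        end     => $+[0],    # offset where the overall match ends
    };
}

my $probe = zero_width_probe("0123456789");
printf "captured '%s', match spans %d..%d\n",
    $probe->{capture}, $probe->{start}, $probe->{end};
```

The start and end offsets are equal, so the match itself consumed nothing even though the capture group is filled.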


#11929 First ask yourself `How would I do this without a computer?' Then have the computer do it the same way.


Re^3: Regex Greed
by AnomalousMonk (Monsignor) on Aug 07, 2012 at 22:55 UTC

    Unfortunately, the referenced section does not discuss the zero-width-lookahead-to-a-capture trick of jwkrahn's solution. Does anyone know where this is covered in the standard docs (as opposed to a PerlMonks node)?

      A search on “overlapping matches” in perldoc doesn’t turn up anything relevant. However, I did find the following in the Camel Book (4th Edition, pages 247–8, underlining added):

      Lookahead assertions can be used to implement overlapping matches. For example,
      "0123456789" =~ /(\d{3})/g
      returns only three strings: 012, 345, and 678. By wrapping the capture group with a lookahead assertion:
      "0123456789" =~ /(?=(\d{3}))/g
      you now retrieve all of 012, 123, 234, 345, 456, 567, 678, and 789. This works because this tricky assertion does a stealthy sneakahead to run up and grab what’s there and stuff its capture group with it, but being a lookahead, it reneges and doesn’t technically consume any of it. When the engine sees that it should try again because of the /g, it steps one character past where last it tried.
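
For anyone who wants to check the quoted behavior, a short sketch follows. It is not part of the Camel Book text, and the variable names are made up for illustration. It runs both patterns in list context and prints what each returns:

```perl
use strict;
use warnings;

my $digits = "0123456789";

# Plain capture: each match consumes three digits, so the next
# attempt starts after them and the matches cannot overlap.
my @plain = $digits =~ /(\d{3})/g;

# Capture wrapped in a lookahead: the match consumes nothing, so
# under /g the engine moves ahead one character between attempts.
my @overlap = $digits =~ /(?=(\d{3}))/g;

print "plain:   @plain\n";     # plain:   012 345 678
print "overlap: @overlap\n";   # overlap: 012 123 234 345 456 567 678 789
```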

      HTH,

      Athanasius <°(((><contra mundum
