
Re: Regex Greed

by jwkrahn (Monsignor)
on Aug 07, 2012 at 21:12 UTC

in reply to Regex Greed

$ perl -e'
my $test = "xTx\nxxTxxT";
my $rx = qr/(?=(x...T))/s;
my @matches = $test =~ /$rx/g;
print "match #", $_ + 1, ":\n$matches[$_]\n" for 0 .. $#matches;
'
match #1:
x
xxT
match #2:
xTxxT

Re^2: Regex Greed
by kennethk (Abbot) on Aug 07, 2012 at 21:41 UTC
    To add a little detail, jwkrahn is using a lookahead, so the match itself is zero-width; the text comes back only through the capture group inside the lookahead. See "Looking ahead and looking behind" in perlretut.
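A minimal sketch of what "zero-width" means here, reusing the test string and pattern from jwkrahn's one-liner. The `@report` array and the output format are illustrative assumptions, not part of the original post. It walks the string with `/g` in scalar context and records, for each match, the position, the length of the text the match consumed (`$&`), and the length of the captured text (`$1`).

```perl
use strict;
use warnings;

my $test = "xTx\nxxTxxT";
my @report;    # hypothetical name, used only for this illustration

# Scalar-context /g lets us inspect each match individually.
while ($test =~ /(?=(x...T))/sg) {
    # $& holds what the match itself consumed; a pure lookahead consumes nothing.
    # $1 holds what the capture group inside the lookahead saw.
    push @report, sprintf "pos=%d matched-length=%d captured-length=%d",
        pos($test), length($&), length($1);
}

print "$_\n" for @report;
```

The matched length should be 0 on every iteration while the captured length is 5, which is why the second match can begin inside the text the first one captured.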

    #11929 First ask yourself `How would I do this without a computer?' Then have the computer do it the same way.

      Unfortunately, the referenced section does not discuss the zero-width-lookahead-to-a-capture trick of jwkrahn's solution. Does anyone know where this is covered in the standard docs (as opposed to a PerlMonks node)?

        A search on “overlapping matches” in perldoc doesn’t turn up anything relevant. However, I did find the following in the Camel Book (4th Edition, pages 247–8, underlining added):

        Lookahead assertions can be used to implement overlapping matches. For example,
        "0123456789" =~ /(\d{3})/g
        returns only three strings: 012, 345, and 678. By wrapping the capture group with a lookahead assertion:
        "0123456789" =~ /(?=(\d{3}))/g
        you now retrieve all of 012, 123, 234, 345, 456, 567, 678, and 789. This works because this tricky assertion does a stealthy sneakahead to run up and grab what’s there and stuff its capture group with it, but being a lookahead, it reneges and doesn’t technically consume any of it. When the engine sees that it should try again because of the /g, it steps one character past where last it tried.
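The two expressions quoted above can be run side by side. This is a small sketch, with `$digits`, `@plain`, and `@overlapping` as illustrative variable names that do not appear in the book:

```perl
use strict;
use warnings;

my $digits = "0123456789";

# Ordinary capture: each match consumes three digits, so matches cannot overlap.
my @plain = $digits =~ /(\d{3})/g;

# Capture wrapped in a lookahead: nothing is consumed, so the engine
# retries one character further along and the captures overlap.
my @overlapping = $digits =~ /(?=(\d{3}))/g;

print "plain:       @plain\n";
print "overlapping: @overlapping\n";
```

In list context the first match returns 012, 345, and 678, while the second returns all eight three-digit windows from 012 through 789, as the passage describes.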


        Athanasius <°(((><contra mundum
