PerlMonks  

Regex Greed

by temporal (Pilgrim)
on Aug 07, 2012 at 20:48 UTC ( #986082=perlquestion )
temporal has asked for the wisdom of the Perl Monks concerning the following question:

I want to find all regex matches within a string. The problem is that sometimes parts of one match are parts of another and I think this is throwing things off.

Example:

    $test = "xTx\nxxTxxT";
    $rx = 'x...T';
    @matches = $test =~ /$rx/sg;
    printf "match #%i:\n%s\n",++$i,$_ for @matches;

The code above finds only 1 match when there are actually 2 matches within the string. Perl grabs the first match "x\nxxT" and then I think it starts looking for a new match where that last one ended. What I'd like to do is also get the other match - "xTxxT".

Is there any way to get Perl to check the entire string for each greedy regex pass excluding any previously found patterns?
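
The suspicion about where the next search starts can be checked with pos(), which reports the offset at which the next /g attempt on a string will begin. The following is only a minimal sketch using the test string from the question; the loop and the @seen array are illustrative names, not part of the original post.

```perl
use strict;
use warnings;

my $test = "xTx\nxxTxxT";
my @seen;    # hypothetical helper: "start:next" pairs, one per match

# In scalar context, /g finds one match per iteration and remembers
# where it stopped; pos() returns that offset.
while ($test =~ /x...T/sg) {
    # $-[0] is the offset where the current match begins.
    push @seen, $-[0] . ":" . pos($test);
    printf "match starts at %d, next search starts at %d\n", $-[0], pos($test);
}
```

This prints a single line, reporting a match that starts at offset 2 with the next search starting at offset 7. The second candidate, "xTxxT", begins at offset 5, which lies inside the text already consumed, so the engine never tries it.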

Re: Regex Greed
by jwkrahn (Monsignor) on Aug 07, 2012 at 21:12 UTC
    $ perl -e'
    my $test = "xTx\nxxTxxT";
    my $rx = qr/(?=(x...T))/s;
    my @matches = $test =~ /$rx/g;
    print "match #", $_ + 1, ":\n$matches[$_]\n" for 0 .. $#matches;
    '
    match #1:
    x
    xxT
    match #2:
    xTxxT
      To add a little detail, jwkrahn is using a look ahead so the actual match itself is zero-width. See Looking ahead and looking behind in perlretut.
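
      To make the zero-width point concrete, here is a small sketch (not jwkrahn's code, just an illustration built on the same pattern) that records where each match sits and how long it is. The @found array and the printing loop are assumptions added for the example. Because the lookahead consumes nothing, each overall match has length 0, and Perl will not accept a second empty match at the same position, so the next attempt moves forward and overlapping candidates are still seen.

```perl
use strict;
use warnings;

my $test = "xTx\nxxTxxT";
my @found;    # hypothetical helper: [offset, match length, captured text]

# (?=...) looks ahead without consuming characters, so the overall match
# is empty; the capture group inside it still records the text ahead.
while ($test =~ /(?=(x...T))/sg) {
    # $-[0] and $+[0] are the start and end offsets of the whole match.
    push @found, [ $-[0], $+[0] - $-[0], $1 ];
}

for my $m (@found) {
    (my $shown = $m->[2]) =~ s/\n/\\n/g;    # make the newline visible
    printf "offset %d, match length %d, captured '%s'\n",
        $m->[0], $m->[1], $shown;
}
```

      With the string from the question this reports captures at offsets 2 and 5, both with a match length of 0, which is why the second, overlapping "xTxxT" is no longer skipped.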

      #11929 First ask yourself `How would I do this without a computer?' Then have the computer do it the same way.

        Unfortunately, the referenced section does not discuss the zero-width-lookahead-to-a-capture trick of jwkrahn's solution. Does anyone know where this is covered in the standard docs (as opposed to a PerlMonks node)?

Re: Regex Greed
by temporal (Pilgrim) on Aug 07, 2012 at 21:45 UTC

    Thanks guys, exactly what I was looking for. Always wondered when I'd have to come back and read the regex docs more closely.

    Strange things are afoot at the Circle-K.

      Keep re-reading them. You will always learn something new, and you will never be disappointed or feel that you have wasted your time.

        Keep re-reading them.

        Yea and amen to that, brother! And very often the 'new' thing you learn will be something you forgot five minutes after the last time you read it.
