Re^3: all occurences of a target word by regular expression

by moritz (Cardinal)
on Mar 30, 2010 at 14:58 UTC (#831870)

in reply to Re^2: all occurences of a target word by regular expression
in thread all occurences of a target word by regular expression

The combinatorial approach doesn't have much to do with Perl, but I can try to explain it anyway.

You can, for example, walk through the string and, for each 'a', look at every 'b' that comes after it and count the 'c's that follow that 'b'. Adding up those counts gives the number of possible matches.

So for a target string like 'a b a c b c c' you find that for the first 'a' you have one 'b' followed by three 'c's, and one 'b' followed by two 'c's, which sums to 5.

For the second 'a' you have just one 'b', followed by two 'c's, so all in all you have 5 + 2 = 7 possible matches.
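The counting described above can be sketched directly in Perl. This is only a minimal illustration: the name count_abc is made up for this example, and the target is assumed to be the fixed three-letter sequence 'a', 'b', 'c'.

```perl
use strict;
use warnings;

# Count the ways to pick an 'a', a later 'b', and a later 'c' from $str.
# count_abc is a hypothetical helper name, not something from this thread.
sub count_abc {
    my ($str) = @_;
    my @chars = split //, $str;
    my $total = 0;
    for my $i (0 .. $#chars) {
        next unless $chars[$i] eq 'a';
        for my $j ($i + 1 .. $#chars) {
            next unless $chars[$j] eq 'b';
            # grep in scalar context returns the number of 'c's after this 'b'
            my $cs = grep { $_ eq 'c' } @chars[$j + 1 .. $#chars];
            $total += $cs;
        }
    }
    return $total;
}

print count_abc('abacbcc'), "\n";    # prints 7
```

Characters other than 'a', 'b' and 'c' (such as the spaces in the example string) are simply skipped, so 'a b a c b c c' gives the same count.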

Back to Perl: the regex module I mentioned above doesn't really do any work itself. It just exploits a feature of Perl's built-in regex engine: it causes each match to fail, which forces the regex engine to backtrack into the other alternatives.

This small piece demonstrates that:

$ perl -wle '"abacbcc" =~ /a.*b.*c(?{ $count++ })(?!)/; print $count'
7

(?!) is just a "clever" way to write a regex that never matches. On perl 5.10 or newer you can also write (*FAIL), which does the same thing but is more readable.

The (?{...}) is just a block of Perl code that the regex engine runs each time it has matched the c, just before the forced failure makes it backtrack.
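For use in a script rather than a one-liner, the same trick might look like the sketch below, written with (*FAIL) and therefore assuming perl 5.10 or newer. The counter is declared with our as a precaution, since lexical variables inside (?{...}) blocks were unreliable on older perls.

```perl
use strict;
use warnings;

# Package variable, so the embedded code block sees it on older perls too.
our $count = 0;

# (*FAIL) forces the engine to backtrack after every successful path,
# so the code block runs once per way of matching a, b and c.
'abacbcc' =~ /a.*b.*c(?{ $count++ })(*FAIL)/;

print "$count\n";
```

The match as a whole always fails, so only the side effect on $count is of interest here.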
