http://www.perlmonks.org?node_id=831870


in reply to Re^2: all occurences of a target word by regular expression
in thread all occurences of a target word by regular expression

The combinatorial approach doesn't have much to do with Perl, but I can try to explain it anyway.

For example, you can walk through the string and, for each 'a', look at every 'b' that comes after it and count the 'c's that come after that 'b'. Adding up all of those counts gives the number of possible matches.

So for a target string like 'abacbcc' you find that for the first 'a' you have one 'b' followed by three 'c's and another 'b' followed by two 'c's, which sums to 5.

For the second 'a' you have just one 'b', followed by two 'c's, so all in all you have 5 + 2 = 7 possible matches.
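The counting described above can be sketched in Perl roughly like this. It is only an illustration, not code from the module: the name count_abc is made up, the pattern is hard-wired to a, b, c, and it assumes the string contains no newlines (which the regex `.` would not cross). Walking the string from right to left keeps a running count of 'c's seen so far and of ('b', 'c') pairs seen so far, so one pass is enough.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of the original post): counts the ways to
# pick an 'a', a later 'b', and a still later 'c' from $string.
sub count_abc {
    my ($string) = @_;
    my ($c, $bc, $total) = (0, 0, 0);

    # Walk from the end of the string towards the start.
    for my $ch (reverse split //, $string) {
        if    ($ch eq 'c') { $c++ }            # 'c's to the right of here
        elsif ($ch eq 'b') { $bc += $c }       # this 'b' pairs with each of them
        elsif ($ch eq 'a') { $total += $bc }   # this 'a' starts each b..c pair
    }
    return $total;
}

print count_abc('abacbcc'), "\n";   # prints 7
```

For 'abacbcc' the first 'a' contributes 5 and the second contributes 2, the same numbers as in the walk-through above.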

Back to Perl: the regex module I mentioned above doesn't really do any work itself. It just exploits a feature of Perl's built-in regex engine: it causes each match to fail, which forces the regex engine to backtrack and try the other alternatives.

This small piece demonstrates that:

$ perl -wle '"abacbcc" =~ /a.*b.*c(?{ $count++ })(?!)/; print $count'
7

(?!) is just a "clever" way to write a regex that never matches (on perl 5.10 or newer you can also write (*FAIL), which does the same thing but is more readable).

The (?{...}) is just a block of Perl code that the regex engine runs each time it has matched the c, right before (?!) makes the match fail.
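As a rough sketch, here is the same trick written as a small script with (*FAIL) in place of (?!). The @seen array and the use of $& inside the code block are my additions for illustration: they record the text matched so far each time the block runs, so you can watch the engine backtrack. It assumes perl 5.10 or newer.

```perl
use strict;
use warnings;

# Package variables, like the global $count in the one-liner.
our $count = 0;
our @seen;    # illustrative addition: text matched each time the block runs

# The code block runs after every successful match of the 'c'; (*FAIL) then
# rejects that match and makes the engine backtrack into the next possibility.
'abacbcc' =~ /a.*b.*c(?{ $count++; push @seen, $& })(*FAIL)/;

print "$count\n";          # prints 7
print "$_\n" for @seen;    # one line per attempted match
```

Note that the match as a whole always fails, so the return value of the =~ is of no use here; all the information comes out through the side effects of the code block.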