PerlMonks  

Re^3: all occurences of a target word by regular expression

by moritz (Cardinal)
on Mar 30, 2010 at 14:58 UTC (#831870)


in reply to Re^2: all occurences of a target word by regular expression
in thread all occurences of a target word by regular expression

The combinatorial approach doesn't have much to do with Perl, but I can try anyway.

You can, for example, walk through the string and, for each 'a', look at every 'b' that comes after it and count the 'c's that come after that 'b'. Adding up those counts gives the number of possible matches.

So for a target string like 'a b a c b c c' you find that for the first 'a' you have one 'b' followed by three 'c's and one 'b' followed by two 'c's, which sums to 5.

For the second 'a' you just have one 'b' followed by two 'c's, so all in all you have 7 possible matches.
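The counting described above can be sketched in Perl roughly like this. It is only an illustration: the name count_abc is made up for this example, and it assumes the target is exactly the three letters a, b, c in that order. It walks the string from right to left so that the number of 'c's after each 'b', and of 'b'..'c' pairs after each 'a', is already known when that letter is reached.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the original post): counts the ways to pick
# an 'a', then a later 'b', then a later 'c' from the string.
sub count_abc {
    my ($str) = @_;
    my ($c_seen, $bc_pairs, $total) = (0, 0, 0);

    # Walk right to left, so everything that follows a character
    # has already been counted when we reach it.
    for my $ch (reverse split //, $str) {
        if    ($ch eq 'c') { $c_seen++ }              # one more 'c' to the right
        elsif ($ch eq 'b') { $bc_pairs += $c_seen }   # this 'b' pairs with each later 'c'
        elsif ($ch eq 'a') { $total += $bc_pairs }    # this 'a' starts every later b..c pair
    }
    return $total;
}

print count_abc('abacbcc'), "\n";   # prints 7
```

For 'abacbcc' the first 'a' contributes 5 and the second contributes 2, matching the hand count above.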

Back to Perl: the regex module I mentioned above doesn't really do any work itself; it just exploits a feature of Perl's built-in regex engine. It causes each match to fail, which forces the regex engine to backtrack and try the other alternatives.

This small piece of code demonstrates that:

$ perl -wle '"abacbcc" =~ /a.*b.*c(?{ $count++ })(?!)/; print $count'
7

(?!) is just a "clever" way to write a regex that never matches (on perl 5.10 or newer you can also write (*FAIL), which does the same thing but is more readable).

The (?{...}) is just a block of Perl code that the regex engine runs after it has matched the c but before the match is forced to fail.
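As a rough sketch, the same trick can be written as a small script using (*FAIL) instead of (?!). The sub name count_by_backtracking is invented for this example, and the counter is a package variable (declared with our) on the assumption that this is the safest way to share it with the (?{...}) block across perl versions.

```perl
use strict;
use warnings;
use 5.010;   # (*FAIL) needs perl 5.10 or newer

our $count;  # package variable, visible inside the (?{...}) block

# Hypothetical helper (not from the original post): counts every way
# /a.*b.*c/ can match by forcing the engine to backtrack.
sub count_by_backtracking {
    my ($str) = @_;
    local $count = 0;

    # The code block runs each time a 'c' has just matched; (*FAIL) then
    # rejects that match, so the engine backtracks to the next alternative.
    $str =~ /a.*b.*c(?{ $count++ })(*FAIL)/;

    return $count;
}

say count_by_backtracking('abacbcc');   # prints 7
```

The match as a whole always fails, so the return value of the =~ is ignored; only the side effect of the code block matters.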

