Re: Puzzled by regex
by Don Coyote (Monk)
on Apr 10, 2013 at 08:18 UTC
Random thoughts on match operating efficiency
For the + (1 or more) quantifier, the difference between the maximally matching (greedy) form (.+) and the minimally matching (nongreedy) form (.+?) lies in what is matched but, more importantly, in how, or from where, it is matched.
In the greedy case the quantifier first runs to the end of the line, then backtracks one position at a time, checking whether the rest of the pattern matches, and repeats until it succeeds or gets back to the starting match position.
In the nongreedy case the quantifier starts at the starting match position and moves forward one character at a time until the rest of the pattern matches or the end of the line is reached.
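A minimal sketch of the two behaviours, written with Python's `re` module as a stand-in (an assumption on my part: for these quantifiers it backtracks the same way, so the result should carry over). The sample string `key=alpha=beta` is made up for illustration.

```python
import re

text = "key=alpha=beta"  # hypothetical sample input

# Greedy: .+ first consumes as much as it can, then backtracks one
# position at a time until the following "=" can match, so the capture
# ends at the LAST "=".
greedy = re.match(r"(.+)=", text).group(1)

# Nongreedy: .+? first consumes a single character, then extends one
# character at a time until the following "=" can match, so the capture
# ends at the FIRST "=".
lazy = re.match(r"(.+?)=", text).group(1)

print(greedy)  # key=alpha
print(lazy)    # key
```

Here the two forms differ in what they capture because there is more than one place where the rest of the pattern can succeed.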
Applying the + quantifier behaviour to the ? quantifier:
Applying this to the ? (0 or 1) quantifier, I would expect the two forms to differ in where matching starts: the greedy form starts one position ahead, while the nongreedy form starts at the starting match position.
Here the difference is not in what is matched, but in how, or from where, the matching starts. In effect this makes the nongreedy match slightly more efficient, since it saves one jump-ahead operation per use.
I would imagine this has been optimised internally, unless (or perhaps especially if) there is some security benefit to a look-forward match as opposed to a look-behind match.
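A short sketch of the ? versus ?? case, again using Python's `re` as a stand-in with made-up input. It suggests that the two forms agree on what is matched when the rest of the pattern pins the match down, and that the preference (try one first, or try zero first) only shows in the result when nothing follows the quantifier.

```python
import re

# Nothing follows the quantifier, so its preference decides the result.
greedy_alone = re.match(r"a?", "ab").group(0)   # tries one "a" first
lazy_alone = re.match(r"a??", "ab").group(0)    # tries zero "a" first

# A following "b" pins the match down: the lazy form tries zero "a",
# fails to match "b" there, and falls back to consuming the "a".
greedy_pinned = re.match(r"a?b", "ab").group(0)
lazy_pinned = re.match(r"a??b", "ab").group(0)

print(repr(greedy_alone), repr(lazy_alone))    # 'a' ''
print(repr(greedy_pinned), repr(lazy_pinned))  # 'ab' 'ab'
```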
Update, later the same day:
Crumbs, I wrote "+ (0 or 1) quantifier", which is incorrect. '+' is the (1 or more) quantifier.
OK, so to fix the above I have replaced the '*' quantifiers with '+' quantifiers, and the '+' quantifiers with '?' quantifiers, so that what I wrote at least makes sense. The reasoning held despite the syntax errors, which are now rectified.
After attempting to provide some examples where differences would be found between the default greedy behaviour and the nongreedy behaviour indicated by a secondary '?' quantifier, I realised that you are right: when the '\n' are included, there are no differences in what is matched. That agrees with davidos and with my own response: the difference is in how the match is carried out.
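A small check of that conclusion, sketched with Python's `re` as a stand-in. The pattern and the sample line are hypothetical and are not the regex from the original question. The only assumption is that `.` does not match a newline by default, which holds in Python and in Perl without the /s modifier.

```python
import re

line = "some text here\n"  # hypothetical single line ending in a newline

# Greedy: .+ runs up to (not past) the newline, then "\n" matches at once.
greedy = re.match(r"(.+)\n", line).group(1)

# Nongreedy: .+? grows one character at a time until "\n" can match.
lazy = re.match(r"(.+?)\n", line).group(1)

print(greedy == lazy)  # True: same capture, reached by different routes
```

Because the newline is the only place where the rest of the pattern can succeed, both forms end up with the same capture and differ only in the order in which positions are tried.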