PerlMonks
Re: Re: Ovid, Long Live .*? (dot star question-mark)
by danger (Priest) on Sep 06, 2001 at 05:46 UTC ( [id://110472]=note )
However, *capturing* just a non-greedy dot star will still suffer from having to test the remaining pattern (outside of the parens) at each step. Thus, the negated character class will perform a lot better in the following:
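A minimal sketch of such a comparison, using the core Benchmark module. The test string and the sub names (`lazy_dot_star`, `negated_class`) are illustrative assumptions, not the original listing, and the actual timings will depend on your perl version and input:

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Assumed test string: a long run of non-colon characters, then a colon.
my $str = ('a' x 1000) . ':rest';

# Non-greedy dot star: each time the capture grows by one character,
# the engine has to try the ':' that follows the parens.
sub lazy_dot_star {
    my ($s) = @_;
    return $s =~ /^(.*?):/ ? $1 : undef;
}

# Negated character class: consumes the whole run of non-colons in one go.
sub negated_class {
    my ($s) = @_;
    return $s =~ /^([^:]*):/ ? $1 : undef;
}

# Run each candidate for at least one CPU second and print a comparison table.
cmpthese(-1, {
    lazy_dot_star => sub { lazy_dot_star($str) },
    negated_class => sub { negated_class($str) },
});
```

Both subs capture the same text here, so any difference the table shows is purely a matter of how much work the engine does to get there.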
However, the two approaches *express* different things (they just happen to functionally coincide in the above). For some things, .*? is the right approach; for others, a negated character class is the right approach. And, to add to japhy's additional warning regarding the stricter meaning of a negated character class, I'll offer another example. For those who do not see the potential difference in meaning and use of each approach, consider the following contrived example: I want to match (and extract) the first two fields of colon-separated data, but only when the third field starts with an 'A' (let's not worry for a minute about whether split() would be a better approach):
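A minimal sketch of that test follows. The sample data and the sub names (`fields_lazy`, `fields_negated`) are made up for illustration and are not the original listing; the data is arranged so that only the first two lines have an 'A' at the start of the third field:

```perl
use strict;
use warnings;

# Made-up sample data: only the first two lines satisfy the spec.
my @lines = (
    'one:two:Apple:four',
    'foo:bar:Axe:baz',
    'abc:def:ghi:Ajk',
    'x:y:z:w:Alpha',
);

# Non-greedy dot star version: returns the two captures, or an empty list.
sub fields_lazy {
    my ($line) = @_;
    return $line =~ /^(.*?):(.*?):A/ ? ($1, $2) : ();
}

# Negated character class version: returns the two captures, or an empty list.
sub fields_negated {
    my ($line) = @_;
    return $line =~ /^([^:]*):([^:]*):A/ ? ($1, $2) : ();
}

# Print what each version extracts from every line.
for my $line (@lines) {
    my @lazy    = fields_lazy($line);
    my @negated = fields_negated($line);
    printf "%-20s lazy: %-16s negated: %s\n",
        $line,
        (@lazy    ? "[@lazy]"    : 'no match'),
        (@negated ? "[@negated]" : 'no match');
}
```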
The non-greedy DS version doesn't work according to the spec (only the first two lines have an 'A' in the 3rd field). That's because the dot star part in (.*?): does not say "match only up to the next colon" (as some people occasionally believe it does), it says: "match as few (of *any* characters) as we can and still have the remainder of the expression match". When the whole pattern is (.*?):, the end result (aside from efficiency) is the same as with a negated character class, but if the pattern that follows is more than a single character, the two are not at all the same. I only wanted to reiterate this because I've often seen both beginners and more experienced programmers make the mistake of thinking that the non-greedy dot star and a negated character class are interchangeable, and they simply aren't.
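To see the "as few as we can and still have the remainder match" rule on a single string, here is a small sketch (the input line and variable names are made up for illustration). It compares what each form captures when the pattern ends at one colon, and when more pattern follows:

```perl
use strict;
use warnings;

my $line = 'abc:def:ghi:Ajk';   # made-up input

# Whole pattern is the capture plus one colon: both forms capture 'abc'.
my ($lazy_one)    = $line =~ /^(.*?):/;
my ($negated_one) = $line =~ /^([^:]*):/;
print "single colon:     lazy='$lazy_one' negated='$negated_one'\n";

# More pattern after the colon: the lazy capture keeps growing, colons
# included, until ':A' can match, so it captures 'def:ghi'. The negated
# class cannot cross a colon, so that match fails outright.
my ($lazy_two)    = $line =~ /^.*?:(.*?):A/;
my ($negated_two) = $line =~ /^[^:]*:([^:]*):A/;
print "followed by ':A': lazy='",
    (defined $lazy_two    ? $lazy_two    : 'no match'), "' negated='",
    (defined $negated_two ? $negated_two : 'no match'), "'\n";
```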