PerlMonks
Matching and nonmatching multiple regexps at once
by Yary (Pilgrim) on Apr 20, 2012 at 14:44 UTC ( [id://966203]=perlmeditation )
We all know how to match "this or that": /this|that/. And we can match "this and not that" simply: /this/ && !/that/.

Once in a while I'm faced with a library that wants me to provide a regexp, and I need it to test a "this and not that" pair of patterns. (If it's my own code, it will accept a coderef like sub {/this/ && !/that/} as a condition, and all is well...)

One easy-to-explain workaround is to use a code-conditional pattern:

    m:^(?(?{/this/ && !/that/})|(?!)):

The (?{...}) block is the condition. When it is true, the empty "yes" branch matches; otherwise the (?!) in the "no" branch fails the match. The same pattern can also be spelled out more verbosely.
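Here is a sketch of how the code-conditional pattern might be laid out with the x modifier. It is not the original verbose listing, and the name $this_not_that is made up for illustration. One thing to watch: comments inside a qr{}x literal are still subject to interpolation, so they should not contain $ or @.

```perl
use strict;
use warnings;

# "this" and "that" are placeholder sub-patterns.
my $this_not_that = qr{
    ^                               # test once, at the start of the string
    (?(?{ /this/ && !/that/ })      # condition: Perl code run against the target
                                    # "yes" branch is empty, so it always matches
      |
        (?!)                        # "no" branch: always fails
    )
}x;

# Inside the code block, the default variable holds the string being
# matched, so the pattern works even when applied with =~ to another variable.
for my $text ('only this', 'this and that', 'only that') {
    printf "%-15s %s\n", $text, ($text =~ $this_not_that ? 'match' : 'no match');
}
```

Only the first string should report a match. Because the pattern is a qr// literal with no interpolated variables, it should not need use re 'eval'.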
And it works. Still, as I was settling down one evening a few nights ago, I got to thinking of alternatives. We can use look-ahead assertions:

    /^(?=.*this)(?!.*that)/s

which is fairly easy for me to read. It does need the ^ anchor: without it, a failed attempt at the first position is retried at later ones, so "that this" would match starting at offset 1, where "that" is no longer ahead. The s modifier at the end is so the .* won't stop at a line break.

Then I had a thought going back to alternation, /this|that/, and a class I had on logic. Another way of saying "this and not that" is "not (not this or that)". Which leads to

    /^(?!(?!.*this)|(?=.*that))/s

There might be a better way of expressing it, but my time is limited...

Speaking of time, I tossed them all into a benchmarking script and ran them on the machine I'm typing from, which runs perl 5.12.1, and I am a little surprised by the results.
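The original benchmarking script and its output are not reproduced here. What follows is a minimal sketch of how such a comparison could be set up with the core Benchmark module. The labels (two_ops, lookahead, demorgan, code_cond), the sample strings, and the iteration count are all assumptions, not the original ones. The single-regexp variants are anchored with ^, since unanchored look-aheads can succeed at a later offset (the "that comes before this" sample is there to catch that).

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Each candidate returns 1 when the string contains "this" but not "that".
my %candidate = (
    two_ops   => sub { ($_[0] =~ /this/ && $_[0] !~ /that/) ? 1 : 0 },
    lookahead => sub { ($_[0] =~ /^(?=.*this)(?!.*that)/s) ? 1 : 0 },
    demorgan  => sub { ($_[0] =~ /^(?!(?!.*this)|(?=.*that))/s) ? 1 : 0 },
    code_cond => sub { ($_[0] =~ m:^(?(?{/this/ && !/that/})|(?!)):) ? 1 : 0 },
);

# Hypothetical inputs, each paired with the expected answer.
my @samples = (
    [ 'only this here',                     1 ],
    [ 'this and that together',             0 ],
    [ 'only that here',                     0 ],
    [ 'nothing relevant',                   0 ],
    [ 'that comes before this',             0 ],  # false positive without ^
    [ "this on line one\nthat on line two", 0 ],  # look-aheads need /s here
    [ "line one\nthis on line two",         1 ],  # look-aheads need /s here
);

# First make sure every variant gives the expected answer on every sample.
my $mismatches = 0;
for my $sample (@samples) {
    my ($text, $expected) = @$sample;
    for my $name (sort keys %candidate) {
        next if $candidate{$name}->($text) == $expected;
        $mismatches++;
        warn "$name gives the wrong answer for '$text'\n";
    }
}
print "mismatches: $mismatches\n";

# Then time each variant over the whole sample set.
my @texts = map { $_->[0] } @samples;
my %bench;
for my $name (keys %candidate) {
    my $code = $candidate{$name};
    $bench{$name} = sub { $code->($_) for @texts };
}
cmpthese(20_000, \%bench);
```

Every variant here pays the same sub-call overhead, so only the relative ordering is meaningful, and that ordering will likely vary with the perl version and the inputs.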
What surprises me is both how much faster the three separate matches ("3calls") are than any single-regexp alternative, and how much slower the "code" regexp is. The speed of /a/ && !/b/ comes from the optimizations that the simpler patterns can undergo. The "code" regexp runs those same simple patterns, but I guess that advantage is dwarfed by the overhead of setting up an eval inside a regexp.

And you, dear readers, have you any other favorite ways of saying "match this and not that" in a single regexp?