Analysis of Regular Expressions
by PetaMem (Priest) on Mar 17, 2010 at 08:33 UTC
PetaMem has asked for the
wisdom of the Perl Monks concerning the following question:
This one is certainly a little difficult/hairy/tricky, and I'm aware that there probably can't be a definitive answer. In fact, I had a hard time even deciding whether to put this into Seekers or Meditations.
I need a function that returns either the "genericity" or the "specificity" of a regular expression. The idea is that, according to a measure of "specificity", the following regular expressions are sorted from the most specific to the most generic:
In other words, a regular expression is more specific than another if it "matches less" than that other regex. (Well, and there's the problem: it is questionable whether 4) isn't more specific than 3).)
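One rough way to make "matches less" measurable is to count how many strings from a fixed sample each pattern matches. The sketch below is an illustration only: the patterns and sample strings are invented for the example (they are not the list referred to above), `match_fraction` is a hypothetical helper, and it is written in Python rather than Perl. The result depends entirely on how representative the sample is, so it is an estimate and not a definition of specificity.

```python
import re

def match_fraction(pattern, samples):
    """Fraction of sample strings fully matched by `pattern`.

    A lower value means the pattern "matches less", i.e. is more specific
    with respect to this particular sample.
    """
    rx = re.compile(pattern)
    hits = sum(1 for s in samples if rx.fullmatch(s))
    return hits / len(samples)

# Invented sample strings and patterns, purely for illustration.
samples = ["cat", "cot", "cut", "dog", "c4t", "ct", "caat", ""]
patterns = [r"cat", r"c[ao]t", r"c.t", r"c.*t", r".*"]

# Sort from most specific (matches fewest samples) to most generic.
ranked = sorted(patterns, key=lambda p: match_fraction(p, samples))

for p in ranked:
    print(f"{p!r}: {match_fraction(p, samples):.3f}")
```

This is slow compared with a purely syntactic score, since every pattern has to be run against every sample, but it can serve as a reference to check a faster heuristic against.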
Any creative ideas on how to achieve this? Oh, and yes: the computation of the "genericity/specificity" should be fast. The best I could come up with so far was to dissect each regular expression with other regular expressions, adding or subtracting "weight" at the occurrence of different metacharacters, with very limited success.
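For reference, the weighting approach described above might look roughly like the following sketch. It is in Python rather than Perl, and everything in it is an assumption made for illustration: the token categories, the `WEIGHTS` values, and the `specificity` function are hypothetical, and the tokenizer is deliberately naive (it treats every escape such as `\d` like a literal-ish token and does not handle `]` inside a character class).

```python
import re

# Naive tokenizer: each alternative is one token category.
TOKEN = re.compile(r"""
    (?P<escape>\\.)               # escaped character, e.g. \. or \d
  | (?P<cls>\[[^\]]*\])           # character class, e.g. [a-z]
  | (?P<quant>[*+?]|\{[^}]*\})    # quantifier
  | (?P<dot>\.)                   # any-character wildcard
  | (?P<meta>[()|^$])             # grouping, alternation, anchors
  | (?P<literal>.)                # anything else counts as a literal
""", re.VERBOSE | re.DOTALL)

# Hypothetical weights: positive = more specific, negative = more generic.
WEIGHTS = {
    "literal": 2.0,
    "escape": 1.0,
    "cls": 0.5,
    "meta": 0.0,
    "dot": -1.0,
    "quant": -2.0,
}

def specificity(pattern):
    """Sum the weights of all tokens in `pattern` (higher = more specific)."""
    return sum(WEIGHTS[m.lastgroup] for m in TOKEN.finditer(pattern))

# Invented example patterns, sorted from most specific to most generic.
patterns = [r".*", r"c.t", r"cat", r"c.*t", r"c[ao]t"]
ranked = sorted(patterns, key=specificity, reverse=True)

for p in ranked:
    print(f"{p!r}: {specificity(p)}")
```

A single pass over the pattern keeps this fast, but the score only reflects syntax: it cannot tell that `a|a` matches no more than `a`, which is exactly where this kind of heuristic runs into trouble.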
So if anyone has a good idea or maybe a pointer to something similar that has been done before, I'd love to see that.