PerlMonks |
Re: Analysis of Regular Expressions by jffry (Hermit)
on Mar 17, 2010 at 20:24 UTC ( [id://829265] )
Assumptions:
Write a daemon that continuously runs each regular expression against an ever-growing list of files. The daemon updates a table in which each row has two columns: the regular expression and its average line count. Your output program then simply sorts the table on the average-line-count column, so reporting stays fast.

However, as the daemon runs each regular expression against more and more files, the rankings may shift. Newly added regular expressions will naturally have more volatile ranks than older ones that have already been run against thousands of files. To combat this, you could require a minimum file-comparison quantity before a regular expression appears in the table. For speed, you could have the daemon give priority to newly added regular expressions until their ranks stabilize. In fact, both of these thresholds should be configurable as tuning parameters of the daemon.

What I like about this approach is that it throws all the theoretical junk out the door. Brute force can be ugly, but then again, the map is not the territory, and brute force reveals the territory.
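As a rough illustration, here is a minimal sketch of the daemon's scoring step in Perl. All names here (`score_file`, `ranked`, the `%table` layout, `$min_files`) are my own illustrative choices, not anything from the post; a real daemon would loop forever over the file list and persist its table rather than keep it in memory.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# %$table maps each regex (as a string) to a two-element array:
# [ accumulated matching-line count, number of files examined ].

# Run every regex in the table against one file, counting matching lines.
sub score_file {
    my ($table, $file) = @_;
    open my $fh, '<', $file or return;
    my %hits = map { $_ => 0 } keys %$table;
    while (my $line = <$fh>) {
        for my $re (keys %hits) {
            $hits{$re}++ if $line =~ /$re/;
        }
    }
    close $fh;
    for my $re (keys %$table) {
        $table->{$re}[0] += $hits{$re};   # total matching lines so far
        $table->{$re}[1]++;               # files examined so far
    }
}

# Rank regexes by average matching-line count, hiding entries that have
# not yet met the minimum file-comparison quantity ($min_files, the
# tuning parameter described above).
sub ranked {
    my ($table, $min_files) = @_;
    my %avg;
    for my $re (keys %$table) {
        my ($total, $n) = @{ $table->{$re} };
        $avg{$re} = $total / $n if $n >= $min_files;
    }
    return sort { $avg{$b} <=> $avg{$a} } keys %avg;
}
```

The "priority for new regexes" idea would then amount to calling `score_file` more often for table entries whose file count is still below the threshold.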
In Section: Seekers of Perl Wisdom