heuristic to detect (perl) code by LanX (Chancellor)
on Jan 19, 2013 at 08:28 UTC
LanX has asked for the wisdom of the Perl Monks concerning the following question:
I'm meditating on a regex-based heuristic to roughly detect whether a paragraph of text (a run of lines delimited by '\n\n') is more likely Perl source code than normal text.
The best idea I've had so far is to use regexes to count the lines ending in ';' or '}', possibly followed by a '#' comment.
Another is to check the frequency of words starting with a sigil.
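A rough sketch of these two ideas, written in JS since that is the use case I have in mind. The names (codeScore, looksLikePerl) and both thresholds are made up for illustration and would need tuning against real posts; this is only one way the counting could be done.

```javascript
// Hypothetical helpers; names and thresholds are assumptions, not tested on real posts.

// A line ending in ';' or '}', optionally followed by a '#' comment.
const CODE_LINE_END = /[;}]\s*(?:#.*)?$/;

// A whitespace-separated word that starts with a sigil ($, @, %),
// allowing leading punctuation such as '(' or '"'.
const SIGIL_WORD = /^\W*?[$@%][A-Za-z_]/;

// Returns the share of code-like line endings and of sigil words in a paragraph.
function codeScore(paragraph) {
  const lines = paragraph.split("\n").filter((l) => l.trim() !== "");
  if (lines.length === 0) return { lineRatio: 0, sigilRatio: 0 };
  const codeLines = lines.filter((l) => CODE_LINE_END.test(l)).length;
  const words = paragraph.split(/\s+/).filter(Boolean);
  const sigils = words.filter((w) => SIGIL_WORD.test(w)).length;
  return {
    lineRatio: codeLines / lines.length,
    sigilRatio: sigils / words.length,
  };
}

// Fuzzy verdict: either signal above its (guessed) threshold counts as code.
function looksLikePerl(paragraph, minLineRatio = 0.5, minSigilRatio = 0.1) {
  const { lineRatio, sigilRatio } = codeScore(paragraph);
  return lineRatio >= minLineRatio || sigilRatio >= minSigilRatio;
}
```

Combining the two signals with a plain "or" is the simplest choice; a weighted sum might well behave better on short paragraphs.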
I'm not talking about a valid parser, just a fuzzy detector.
Any better ideas?
One use case could be a JS snippet that checks the contents of a posting in the monastery, warns about missing <code> tags, and offers to add them.
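To make that use case a bit more concrete, here is a minimal sketch of the checking step only, with no page integration. The function name findUntaggedCode and the built-in default detector are hypothetical; any better detector could be passed in as the second argument.

```javascript
// Hypothetical sketch: names and the default detector are assumptions.

// Crude default: at least half of the non-empty lines end in ';' or '}',
// optionally followed by a '#' comment.
const defaultDetector = (paragraph) => {
  const lines = paragraph.split("\n").filter((l) => l.trim() !== "");
  const hits = lines.filter((l) => /[;}]\s*(?:#.*)?$/.test(l)).length;
  return lines.length > 0 && hits / lines.length >= 0.5;
};

// Returns the paragraphs of a posting that look like code but sit outside <code> tags.
function findUntaggedCode(posting, isCode = defaultDetector) {
  // Drop everything already wrapped in <code>...</code>.
  const outside = posting.replace(/<code>[\s\S]*?<\/code>/gi, "\n\n");
  // Paragraphs are runs of lines separated by blank lines.
  return outside
    .split(/\n\s*\n/)
    .map((p) => p.trim())
    .filter((p) => p !== "" && isCode(p));
}
```

A script on the posting form could call this before submit and, if the result is non-empty, show a warning and offer to wrap those paragraphs.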
(I'm a bit tired of unreadable posts here, and of all the edit considerations and replies that follow.)
PS: I'm not sure whether this thread would be better placed in PM-Discussions.