My apologies for misunderstanding what problem you thought HTML::Parser would be a good fit for.
While I appreciate the compliment, I do think that it was functional programming which made this work.
If I am a strong programmer, my strength was shown here in picking the right style for the job, not in executing that style in an amazing way. I agree that your average Perl programmer would botch this job, probably horribly. I also submit that your average competent Perl programmer would botch it too, possibly subtly, but probably not so subtly. I know that I personally wouldn't know how to tackle this in an OO style, and could not come up with a solution I would like in a procedural style.
However, I think that most decent Perl programmers with exposure to functional techniques, when given the core function, would have little difficulty in adding a series of handlers and getting it right from there on in. And that core function is easy to get right because it is doing something conceptually simple: scanning for handled stuff, and escaping anything that doesn't wind up being handled. The barrier here is conceptual, not technical.
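To make "scan for handled stuff, escape the rest" concrete, here is a minimal sketch of what such a core function might look like. It is an illustration, not the code from the earlier post: the names `scrub`, `%handlers` and `%escape`, and the convention that a handler returns undef to decline, are assumptions made for this example.

```perl
use strict;
use warnings;

# Hypothetical handler table (an assumption for this sketch): each key is a
# literal token, each value is a sub that gets a reference to the input
# (pos() sits just past the token) and returns the text to emit, or undef
# to decline.
my %handlers = (
    '<b>'  => sub { return '<b>' },
    '</b>' => sub { return '</b>' },
    '<i>'  => sub { return '<i>' },
    '</i>' => sub { return '</i>' },
);

my %escape = ('&' => '&amp;', '<' => '&lt;', '>' => '&gt;', '"' => '&quot;');

sub scrub {
    my ($raw) = @_;
    # Try longer tokens first so a short token cannot shadow a long one.
    my @tokens = sort { length($b) <=> length($a) } keys %handlers;
    my $out = '';
    pos($raw) = 0;
    CHUNK: while (pos($raw) < length $raw) {
        # Text with nothing special in it passes straight through.
        if ($raw =~ /\G([^&<>"]+)/gc) {
            $out .= $1;
            next CHUNK;
        }
        my $start = pos($raw);
        for my $token (@tokens) {
            next unless $raw =~ /\G\Q$token\E/gc;
            my $result = $handlers{$token}->(\$raw);
            if (defined $result) {
                $out .= $result;
                next CHUNK;
            }
            pos($raw) = $start;    # handler declined: rewind and keep looking
        }
        # Nothing handled it, so escape exactly one character and move on.
        $raw =~ /\G(.)/gcs;
        $out .= $escape{$1};
    }
    return $out;
}

print scrub('<b>x</b> <script>a&b'), "\n";
```

Adding a new piece of markup then means adding one entry to the table; the loop itself does not change.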
As for whether HTML::Parser is a good fit, that depends on how you read Ovid's problem. It would be a good fit if you wanted to just strip out disallowed tags. It would be a bad fit if you wanted to escape them again, leaving text untouched and adding error messages, as I did. It would be a really bad fit if this parsing piece was going to be extended (as the above was) to allow a number of custom markup symbols to be used.
Personally I always get irritated at seeing my mistakes be silent. So while HTML::Parser might solve some spec, it would not give a solution that could be extended nicely. And I am not sure it would solve Ovid's problem to the satisfaction of future requests that might come in. But this does.
More amusing handlers are possible. The right one for :// can autodetect URLs. One for @ can do the same for email addresses. The main loop needs to be changed to a different character class, but "\n" gives you the ability to have a newbie mode. (Note: look for leading spaces on the next line and turn them into &nbsp; please.)
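Here is a rough sketch of how handlers for ://, @ and "\n" could look. Everything in it is an assumption made for illustration rather than the original code: the `markup` and `esc` names, the protocol of passing each handler references to both the input and the output so far, the short list of recognized URL schemes, and the deliberately loose URL and email patterns.

```perl
use strict;
use warnings;

my %escape = ('&' => '&amp;', '<' => '&lt;', '>' => '&gt;', '"' => '&quot;');
sub esc { my ($s) = @_; $s =~ s/([&<>"])/$escape{$1}/g; return $s }

# Hypothetical protocol: a handler gets references to the input (pos() sits
# just past the trigger) and to the output so far, and returns true only if
# it handled the trigger.
my %handlers = (
    '://' => sub {
        my ($raw, $out) = @_;
        # Grab the rest of the URL, leaving trailing punctuation behind.
        $$raw =~ /\G([^\s<>"]*[^\s<>".,;:!?)])/gc or return;
        my $rest = $1;
        # The scheme was already emitted as plain text; take it back.
        $$out =~ s/\b(https?|ftp)\z// or return;
        my $url = esc($1 . '://' . $rest);
        $$out .= qq{<a href="$url">$url</a>};
        return 1;
    },
    '@' => sub {
        my ($raw, $out) = @_;
        $$raw =~ /\G([\w-]+(?:\.[\w-]+)+)/gc or return;
        my $domain = $1;
        # The local part is sitting at the end of the output.
        $$out =~ s/([\w.+-]+)\z// or return;
        my $addr = $1 . '@' . $domain;
        $$out .= qq{<a href="mailto:$addr">$addr</a>};
        return 1;
    },
    "\n" => sub {
        my ($raw, $out) = @_;
        $$out .= "<br>\n";
        # Newbie mode: leading spaces on the next line become &nbsp;.
        if ($$raw =~ /\G( +)/gc) {
            $$out .= '&nbsp;' x length($1);
        }
        return 1;
    },
);

sub markup {
    my ($raw) = @_;
    my @tokens = sort { length($b) <=> length($a) } keys %handlers;
    my $out = '';
    pos($raw) = 0;
    CHUNK: while (pos($raw) < length $raw) {
        # The plain-text class now also stops at ':', '@' and newline.
        if ($raw =~ /\G([^&<>":\@\n]+)/gc) {
            $out .= $1;
            next CHUNK;
        }
        my $start = pos($raw);
        for my $token (@tokens) {
            next unless $raw =~ /\G\Q$token\E/gc;
            next CHUNK if $handlers{$token}->(\$raw, \$out);
            pos($raw) = $start;    # handler declined: rewind
        }
        $raw =~ /\G(.)/gcs;        # unhandled: escape one character
        $out .= esc($1);
    }
    return $out;
}

print markup("mail joe\@example.com or see http://example.com/a?b=1&c=2.\n  bye"), "\n";
```

A trigger that turns out not to be a URL or an address (a lone ":" or "@") simply falls through to the escaping step, so nothing is lost.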
So you see, a ton of different requests can be satisfied, and when the rules conflict (e.g. don't look for URLs in formatted code) there is already a good resolution.
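One way to picture that resolution is to give each mode its own handler table and let handlers switch tables. The sketch below is only a guess at the shape of such a scheme: the `%modes` table, the `render` and `esc` names, the mode stack, the `<code>` to `<pre>` mapping, and the simplified `http://` handler are all hypothetical, and it scans one character at a time for brevity.

```perl
use strict;
use warnings;

my %escape = ('&' => '&amp;', '<' => '&lt;', '>' => '&gt;', '"' => '&quot;');
sub esc { my ($s) = @_; $s =~ s/([&<>"])/$escape{$1}/g; return $s }

# Hypothetical per-mode handler tables. Each handler gets the mode stack and
# a reference to the input, and returns the text to emit.
my %modes = (
    text => {
        '<code>' => sub {
            my ($stack) = @_;
            push @$stack, 'code';    # from here on, only 'code' handlers apply
            return '<pre>';
        },
        'http://' => sub {
            my ($stack, $raw) = @_;
            $$raw =~ /\G([^\s<>"]*)/gc;    # always matches, possibly empty
            my $url = esc('http://' . $1);
            return qq{<a href="$url">$url</a>};
        },
    },
    code => {
        '</code>' => sub {
            my ($stack) = @_;
            pop @$stack;             # back to the enclosing mode
            return '</pre>';
        },
    },
);

sub render {
    my ($raw) = @_;
    my @stack = ('text');
    my $out = '';
    pos($raw) = 0;
    CHUNK: while (pos($raw) < length $raw) {
        # Look the table up on every pass so a mode switch takes effect at once.
        my $handlers = $modes{ $stack[-1] };
        for my $token (sort { length($b) <=> length($a) } keys %$handlers) {
            next unless $raw =~ /\G\Q$token\E/gc;
            $out .= $handlers->{$token}->(\@stack, \$raw);
            next CHUNK;
        }
        $raw =~ /\G(.)/gcs;          # unhandled: escape one character
        $out .= esc($1);
    }
    $out .= '</pre>' x (@stack - 1); # close any code block left open
    return $out;
}

print render('see http://a.b <code>if (x < 1) http://a.b</code>'), "\n";
```

Because the URL handler does not exist in the code table, the rule "no URLs in formatted code" needs no special case: inside a code block the URL is just text to be escaped.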
Looking at this, I don't think I could do all of that (particularly the switching) using any other technique I know. Without having seen functional programming, I would be lost.
In reply to RE (tilly) 4 (not html): Why I like functional programming