HTML::Entities - encode all non-alphanumeric and foreign chars?
by punch_card_don (Curate)
on Sep 23, 2007 at 19:03 UTC
punch_card_don has asked for the
wisdom of the Perl Monks concerning the following question:
I have your classic MySQL DB of user-profile info and a web-based HTML form for input. In my Perl middleware, I'm trying to use HTML::Entities to encode any and all non-alphanumeric characters plus all non-English characters in the user input before building the SQL.
Works OK, except for the "any and all non-alphanumeric characters plus all non-English characters" part. Tried the default
but, as it says in the documentation
This routine replaces unsafe characters in $string with their entity representation. .... The default set of characters to encode are control chars, high-bit chars, and the <, &, >, ' and " characters.
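To make the default behaviour concrete, here is a minimal sketch of a one-argument call. The sample string is made up for illustration; only `encode_entities` itself comes from the module.

```perl
use strict;
use warnings;
use HTML::Entities qw(encode_entities);

# Hypothetical sample input containing both "unsafe" and ordinary punctuation.
my $input = q{a < b & c: (d); e^f};

# With no second argument, only control chars, high-bit chars,
# and < & > ' " are replaced.
my $encoded = encode_entities($input);

print "$encoded\n";   # a &lt; b &amp; c: (d); e^f
```

The colon, parentheses, semicolon and caret pass through untouched, which is the behaviour described in the quoted documentation.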
and that doesn't seem to include a whole bunch of non-alphanumeric characters like :, ;, , , ^, (, ) and a few more.
So I read:
A second argument can be given to specify which characters to consider unsafe (i.e., which to escape). ... this, for example, would encode just the <, &, >, and " characters:
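As a sketch of the two-argument form the documentation describes, the call below passes the four characters as the second argument. The input string is a made-up example.

```perl
use strict;
use warnings;
use HTML::Entities qw(encode_entities);

# Hypothetical sample input.
my $input = q{<a href="x">it's</a>};

# Only the characters listed in the second argument are treated as unsafe.
my $encoded = encode_entities($input, '<>&"');

print "$encoded\n";   # &lt;a href=&quot;x&quot;&gt;it's&lt;/a&gt;
```

Note that the apostrophe is left alone here, because it is not in the list.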
OK, but I don't want to have to generate a list of every non-English character plus all the non-alphanumerics - I might as well make my own regex if I have to do that.
So next I tried this, from the example:
But that leaves a whole bunch of non-alphanumeric chars unencoded as well. So, what the heck, just enlarge the range, right?
That converts every single character, alphanumeric and all. But maybe I'm getting closer...
Will appreciate pointers to get me there.
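One direction that might get there, offered as a sketch rather than a definitive answer: the module's documentation says the second argument uses regular-expression character class syntax, so a leading `^` should negate the set. Assuming "alphanumeric" means ASCII letters and digits, something like the following would encode everything else. The sample string is hypothetical, and the exact form of the numeric entities may differ between module versions, so no output is shown.

```perl
use strict;
use warnings;
use HTML::Entities qw(encode_entities decode_entities);

# Hypothetical sample input: punctuation plus one non-English char (e-acute).
my $input = "caf\x{E9}: O'Brien (x^2); 50%";

# '^A-Za-z0-9' is a negated character class: every character that is
# NOT an ASCII letter or digit is treated as unsafe and encoded.
my $encoded = encode_entities($input, '^A-Za-z0-9');

print "$encoded\n";

# Round-trip check: decoding should give back the original string.
my $decoded = decode_entities($encoded);
print "round trip ok\n" if $decoded eq $input;
```

Spaces get encoded too under this set; adding a space inside the class (for example `'^A-Za-z0-9 '`) would presumably leave them alone.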
Forget that fear of gravity,
Get a little savagery in your life.