http://www.perlmonks.org?node_id=1198040


in reply to Re: Safe string handling
in thread Safe string handling

This tool has been evolving over the course of several years. Every time I've encountered some weirdness that breaks it, I've enhanced it. I recently rewrote it from scratch to incorporate everything I learned along the way. Offhand, I can't point you to a single site that has all the weirdness in my "broken on purpose" example; however, I can tell you that I've encountered websites that have mixed things up in ways that were never intended. At this point, I think my tool handles everything I've ever encountered and is ready for anything that I haven't yet encountered. Even if you've only encountered well-behaved websites, there still is no way to tell Perl to give you the sixth UTF-8 character from a string, as in the "$snowman" example.

Re^3: Safe string handling
by RonW (Parson) on Aug 28, 2017 at 22:08 UTC

    Can you give us URLs to some example websites?

      Betcha the OP is decoding entity references without first decoding UTF-8. That would produce the "mixed" encoding he's claiming to see.
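
For anyone following along, here is a minimal sketch of the mix-up guessed at in the reply above. It is not the OP's tool or code. The sample bytes are made up, and `decode_numeric_entities` is a hypothetical stand-in for a real entity decoder (such as `decode_entities` from HTML::Entities) that only handles decimal references, so the example needs nothing beyond core Perl.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Hypothetical helper, for illustration only: expands decimal numeric
# character references like &#9731; and nothing else.
sub decode_numeric_entities {
    my ($text) = @_;
    $text =~ s/&#(\d+);/chr($1)/ge;
    return $text;
}

# Made-up input: raw bytes as they might arrive over the wire.
# A UTF-8 encoded snowman (3 bytes), a space, then the same
# character (U+2603) written as a numeric entity.
my $bytes = "\xE2\x98\x83 &#9731;";

# Wrong order: entities are expanded while the rest is still raw bytes.
# The result holds three byte-valued characters plus one real U+2603,
# i.e. a "mixed" string.
my $mixed = decode_numeric_entities($bytes);

# Right order: bytes -> characters first, then expand entities.
my $clean = decode_numeric_entities(decode('UTF-8', $bytes));

printf "mixed: %d chars, clean: %d chars\n", length($mixed), length($clean);
# prints "mixed: 5 chars, clean: 3 chars"
```

If the pages in question really are being handled in the first order, that alone could explain strings where some characters look decoded and others look like raw UTF-8 bytes.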