Re: Problem with join'ing utf8 and non-utf8 strings (bug?)
by Juerd (Abbot) on Jun 17, 2008 at 18:23 UTC
Hello dear Unicode newbie,
You made one big mistake. Just one, so it's easy to fix. You assumed that you are supposed to look at the SvUTF8 flag, but you're not. It's an internal value, and because it's Perl you're allowed to look at its state. But you really shouldn't, if you want to keep your sanity.
Don't use is_utf8, okay? If you really want to know about internal flags, please use Devel::Peek's Dump function instead. It will print some extra useful internal values too, such as the other flags in Perl like NOK and IOK. For that matter, pretend that the UTF8 flag's name is UOK.
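If you do want to peek, something like the following is roughly what I mean. It's only a sketch: the variable names are made up for illustration, and the exact dump layout depends on your Perl version, so I'm not showing expected output.

```perl
use strict;
use warnings;
use Devel::Peek qw(Dump);

# Hypothetical sample values, just to have something to inspect.
my $string = "caf\x{e9}";   # a four-character string
my $number = 42;            # an integer

# Dump writes its report to STDERR, not STDOUT.
# Look at the FLAGS line: a string value carries POK, an integer carries IOK.
Dump($string);
Dump($number);
```

Use this for debugging only. Don't branch on anything it shows you.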
Better yet, pretend that the UTF8 flag does not exist. Perl picks an internal encoding for numeric and string values automatically, and only in edge cases (or when you're dealing with internals or XS) do you need to know what is going on.
I think it's best if I don't explain what goes on in your code, and if you ignore explanations by others. Trying to understand what's going on internally is a nice exercise for when you know how to write good Unicode-capable code, but not before that.
Decode your input, and encode your output. Don't query or set the SvUTF8 flag. Thanks!
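In case "decode your input, encode your output" is too abstract, here is a minimal sketch using the Encode module. I'm assuming the input arrives as UTF-8 bytes and that UTF-8 is also what you want to send out; the variable names and sample data are invented for illustration, so substitute whatever encodings your data really uses.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Assumed input: raw bytes from a file or socket, here "café" in UTF-8.
my $input_bytes = "caf\xC3\xA9";

# Decode at the boundary: bytes in, text (characters) out.
my $text = decode('UTF-8', $input_bytes);

# Inside the program, work with text only. Joining decoded strings
# with other text strings needs no flag checks at all.
my $joined = join ' ', $text, "\x{263A}", 'latte';

# Encode at the boundary: text in, bytes out.
my $output_bytes = encode('UTF-8', $joined);

print length($input_bytes),  "\n";   # 5  (bytes)
print length($text),         "\n";   # 4  (characters)
print length($joined),       "\n";   # 12 (characters)
print length($output_bytes), "\n";   # 15 (bytes)
```

Notice that is_utf8 never appears. You know which strings are text because you decoded them yourself, and you know which are bytes because you haven't decoded them yet or have just encoded them.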