in reply to Re: XML::Twig and UTF-8
in thread XML::Twig and UTF-8
Thanks for the help. Preprocessing the text with Text::Unidecode certainly does the trick.
I must be misunderstanding something, though. If I don't preprocess, I get this error:

not well-formed (invalid token) at line 2, column 25, byte 68 at C:/Perl_588/lib/XML/Parser.pm line 187

I thought this was due to the Unicode character in the input, but your comment says that it shouldn't be a problem. What am I doing wrong?
<Text>5CH (the BACKSLASH ý\ý in ISO-IR 6) shall</Text>
01234567890123456789012345
          1         2    ^
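One way to check what the parser is objecting to at that column is to test the raw bytes for valid UTF-8 before handing them to XML::Twig. The following is only a minimal sketch: the helper name first_invalid_utf8_offset is made up for illustration, and the sample string assumes the ý shown above is a single Latin-1 byte (0xFD) in the file, which I have not confirmed.

```perl
use strict;
use warnings;
use Encode qw(decode FB_QUIET);

# Hypothetical helper (not part of XML::Twig or XML::Parser):
# returns the offset of the first byte that is not valid UTF-8,
# or -1 if the whole string decodes cleanly.
sub first_invalid_utf8_offset {
    my ($bytes) = @_;
    my $rest = $bytes;    # decode() modifies its argument, so work on a copy
    # With FB_QUIET, decode() stops at the first malformed sequence and
    # leaves the unprocessed remainder in $rest.
    decode('UTF-8', $rest, FB_QUIET);
    return length($rest) ? length($bytes) - length($rest) : -1;
}

# Assumption: the two "ý" characters are raw 0xFD bytes in the input.
my $line = "<Text>5CH (the BACKSLASH \xFD\\\xFD in ISO-IR 6) shall</Text>";

my $offset = first_invalid_utf8_offset($line);
print $offset < 0
    ? "input is valid UTF-8\n"
    : "first invalid byte at offset $offset\n";
```

If the reported offset matches the column in the parser error, the input is probably in a single-byte encoding such as ISO-8859-1 rather than UTF-8, and the parser is rejecting the byte rather than the character itself.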
Update: added the results of Text::Unidecode and repeated the input text from DATA.