PerlMonks
XML::XPath and character encoding
by mboudreau (Acolyte) on Dec 29, 2015 at 16:10 UTC ( [id://1151377]=perlquestion )
mboudreau has asked for the wisdom of the Perl Monks concerning the following question:

I've got a fairly old script that uses XML::XPath to parse an XML file that includes personal names with accented letters, e.g.,

The names are assigned to variables and then inserted into a MySQL database. Later, a different script collects the names from the database and writes them to a new XML file. The problem (which may have been going on for a while and has only just been reported) is that the accented letters aren't surviving the journey. When I view the names in the database (using Sequel Pro 1.1 for OS X), I see the correct characters. But when the names are written to a new XML file, the characters appear as question marks, whether I view the XML file in Oxygen XML Editor 17.0 for OS X or with the Unix 'more' command in the OS X Terminal app.

I'm pretty confident that the source XML file is OK. It has the explicit XML declaration (as in the sample above) including 'encoding="UTF-8"', and it displays correctly any way I view it (from the Unix command line, from Oxygen, etc.). I have also verified that reading the XML file and writing it out again like so:
preserves the characters. The problem occurs when I parse the file content like so:
I confess I haven't had to deal much with character encoding before, so I'd be grateful for any advice on how to troubleshoot this problem.
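One way to start narrowing this down is to inspect a name at each stage (after parsing, before the database insert, after the database fetch, before writing) and check whether it holds decoded characters or raw UTF-8 bytes. The following is only a minimal sketch using core modules; `describe_string` is a hypothetical helper, the sample name is made up, and the in-memory filehandle stands in for the real output file. It does not reproduce the scripts described above.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Hypothetical helper: report the internal UTF-8 flag, the length, and the
# code points of a string, so decoded characters can be told apart from
# raw UTF-8 bytes.
sub describe_string {
    my ($s) = @_;
    my $flag   = utf8::is_utf8($s) ? 'on' : 'off';
    my $points = join ' ', map { sprintf 'U+%04X', ord($_) } split //, $s;
    return "utf8 flag $flag, length " . length($s) . ": $points";
}

# Made-up sample: "René" as the raw bytes found in a UTF-8 encoded file.
my $bytes = "Ren\xC3\xA9";

# The same name after decoding the bytes into Perl characters.
my $chars = decode('UTF-8', $bytes);

print describe_string($bytes), "\n";   # five bytes, ending in C3 A9
print describe_string($chars), "\n";   # four characters, ending in U+00E9

# Write the characters through an explicit encoding layer. An in-memory
# filehandle is used here as a stand-in for the real output file.
my $buffer = '';
open my $out, '>:encoding(UTF-8)', \$buffer or die "open failed: $!";
print {$out} $chars;
close $out;
```

If the value fetched from the database already contains a literal `U+003F` where the accented letter should be, the character was probably lost before the XML was written, which would point at the database connection settings rather than at XML::XPath or the output step.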