http://www.perlmonks.org?node_id=94713


in reply to XML::Parser multilanguage support

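You need to declare the encoding used in the document in the XML declaration at the very start of it: <?xml version="1.0" encoding="ISO-8859-1" standalone="yes"?>. Without that declaration the parser assumes UTF-8 and chokes on the accented bytes. With it, the accented characters are converted to the proper numeric entities.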
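
If you want to see the difference for yourself, here is a minimal sketch, assuming XML::Parser is installed. The helper name parses_ok and the one-element test document are mine, not from your code. The accented bytes are written as \xE9 escapes so the result does not depend on how the script file itself is saved.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use XML::Parser;

# "\xE9" is the single ISO-8859-1 byte for e-acute.
my $body = qq{<INPUT LABEL="\xE9t\xE9"/>};

my $with_decl    = qq{<?xml version="1.0" encoding="ISO-8859-1"?>\n$body};
my $without_decl = $body;

# Hypothetical helper: returns 1 if the string parses, 0 if the parser dies.
sub parses_ok {
    my ($xml) = @_;
    my $parser = XML::Parser->new;    # no handlers, we only check well-formedness
    return eval { $parser->parse($xml); 1 } ? 1 : 0;
}

# With the declaration the parser knows how to decode the bytes.
my $ok_with = parses_ok($with_decl);

# Without it the parser assumes UTF-8, and a lone \xE9 is not valid UTF-8.
my $ok_without = parses_ok($without_decl);

print "with declaration:    ", ( $ok_with    ? "parsed" : "failed" ), "\n";
print "without declaration: ", ( $ok_without ? "parsed" : "failed" ), "\n";
```

The first parse should succeed and the second should die with a "not well-formed" error, which is the symptom you get when the declaration is missing.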

You may also want to use mirod's XML::Twig, which in its latest version (3.0) can keep the original encoding. Besides this cool feature, I find Twig easier to use than DOM for processing XML docs. Here's what your code would look like with Twig:

#!/usr/bin/perl -w
use strict;
use XML::Twig 3.0;

# keep_encoding => 1 makes Twig output the document in its original encoding
my $parser = XML::Twig->new( keep_encoding => 1 );

# Save this script as ISO-8859-1 so the accented bytes match the declaration
my $xmlstring = <<"XMLEND";
<?xml version="1.0" encoding="ISO-8859-1"?>
<ACTION>
  <INPUT LABEL="Radio Button"/>
  <INPUT LABEL="été"/>
  <RADIO ID="List"> éééàààùùù </RADIO>
</ACTION>
XMLEND

$parser->parse($xmlstring);
$parser->print;
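
To give an idea of why I find Twig easier than DOM, here is a rough sketch that collects the LABEL attribute of every INPUT element with a handler instead of walking the tree. The @labels array and the choice of handler are my own illustration, not something from your original code, and the accented bytes are written as \xE9 escapes so the script's own encoding does not matter.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use XML::Twig;

# Same kind of document as above, with Latin-1 bytes given as escapes.
my $xml = <<"XMLEND";
<?xml version="1.0" encoding="ISO-8859-1"?>
<ACTION>
  <INPUT LABEL="Radio Button"/>
  <INPUT LABEL="\xE9t\xE9"/>
</ACTION>
XMLEND

my @labels;    # filled in by the handler below

my $twig = XML::Twig->new(
    keep_encoding => 1,
    twig_handlers => {
        # Called once for each completely parsed INPUT element.
        INPUT => sub {
            my ( $t, $elt ) = @_;
            push @labels, $elt->att('LABEL');
        },
    },
);
$twig->parse($xml);

print scalar(@labels), " INPUT elements found\n";
print "first label: $labels[0]\n";
```

The handler gets the twig and the element as arguments, so there is no need to search the tree by hand once parsing is done.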

Hope this helps!

update: version 3.0 of XML::Twig can be found here

--
my $OeufMayo = new PerlMonger::Paris({http => 'paris.mongueurs.net'});