http://www.perlmonks.org?node_id=94713


in reply to XML::Parser multilanguage support

You need to declare the encoding used in the document on its very first line: <?xml version="1.0" encoding="ISO-8859-1" standalone="yes"?>. Without that declaration the parser assumes UTF-8 and chokes on the accented characters. With it, they are converted to the proper numeric entities.
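To see why the declaration matters, here is a minimal sketch that uses only the core Encode module (not XML::Parser itself, so treat it as an illustration of the byte-level issue rather than of what the parser does internally). The variable names are mine. It takes "été" as raw ISO-8859-1 bytes, shows that they are not valid UTF-8, and then decodes them with the right encoding:

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Encode qw(decode encode);

# "été" as raw ISO-8859-1 bytes: each "é" is the single byte 0xE9.
my $latin1_bytes = "\xE9t\xE9";

# Try to read those bytes as UTF-8, the default when no encoding is declared.
# FB_CROAK makes decode() die on malformed input; it may modify its argument,
# so we hand it a copy.
my $copy = $latin1_bytes;
my $ok_as_utf8 = eval { decode('UTF-8', $copy, Encode::FB_CROAK); 1 } ? 1 : 0;

# Read the same bytes with the encoding named in the XML declaration.
my $chars      = decode('ISO-8859-1', $latin1_bytes);
my $utf8_bytes = encode('UTF-8', $chars);

print "valid as UTF-8: ", ($ok_as_utf8 ? "yes" : "no"), "\n";
print "characters: ", length($chars), ", UTF-8 bytes: ", length($utf8_bytes), "\n";
```

The first line of output should report that the bytes are not valid UTF-8, which is roughly the situation the parser is in when the encoding is left undeclared.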

You may also want to use mirod's XML::Twig, which in its latest version (3.0) is able to keep the original encoding. Besides this cool feature, I find Twig easier to use than DOM for processing XML docs. Here's what your code would look like with Twig:

#!/usr/bin/perl -w
use strict;
use XML::Twig 3.0;

my $parser = new XML::Twig( keep_encoding => 1 );

my $xmlstring=<<"XMLEND";
<?xml version="1.0" encoding="ISO-8859-1"?>
<ACTION>
  <INPUT LABEL="Radio Button"/>
  <INPUT LABEL="été"/>
  <RADIO ID="List">
    éééàààùùù
  </RADIO>
</ACTION>
XMLEND

$parser->parse($xmlstring);
$parser->print;

Hope this helps!

update: version 3.0 of XML::Twig can be found here

--
my $OeufMayo = new PerlMonger::Paris({http => 'paris.mongueurs.net'});

Re: Re: XML::Parser multilanguage support
by lucdewav (Initiate) on Jul 08, 2001 at 14:11 UTC
    Thanks, I also hope your answer will help me ;-) I have another question: I also need to transform my XML strings with XSL stylesheets. Until now I have done this with the XML::XSLT module (v0.24). It is well integrated with XML::DOM, because you can hand it a DOM object with the $XSLParser->transform_document($DOMobject,"DOM") method and get a string back. Example:
    #!/usr/bin/perl -w
    use strict;
    use XML::XSLT;
    use XML::DOM;
    ...
    sub applyXSLDOM {
        my $self    = shift;
        my $request = shift;
        my $xmldom  = shift;
        my $xsldoc  = "user.xsl";

        # Declared outside the eval so it is still in scope for the return below.
        my $xslparser;
        eval {
            $xslparser = XML::XSLT->new($xsldoc, "FILE");
            $xslparser->transform_document($xmldom, "DOM");
        };
        if ($@) {
            return $request->error("XSL Parsing failed: $@");
        }
        return $xslparser->result_string;
    }
    Do you know of another Perl XSL processor? Thanks a lot. Luc