PerlMonks  

XML::Parser does not parse the Symbol

by gopalr (Priest)
on Jul 11, 2013 at 10:49 UTC ( #1043689=perlquestion )
gopalr has asked for the wisdom of the Perl Monks concerning the following question:

Hi Monks

The following symbol is not parsed by XML::Parser and is not validated, even though I use CDATA.

(¿)

$parser="<symbols><![CDATA[Testing for Symbol ]]><symbols>"

$parser = new XML::Parser(Style => 'Tree');

Can you please provide some guidance?

Thanks in advance!

Gopal R

Re: XML::Parser does not parse the Symbol
by choroba (Abbot) on Jul 11, 2013 at 10:55 UTC
    Works for me. I had to tweak your code a bit to make it run:
    #!/usr/bin/perl
    use warnings;
    use strict;
    use utf8;
    use XML::Parser;

    my $xml = "<symbols><![CDATA[Testing for Symbol ]]></symbols>";
    my $parser = 'XML::Parser'->new;
    $parser->parse($xml);
    لսႽ ᥲᥒ⚪⟊Ⴙᘓᖇ Ꮅᘓᖇ⎱ Ⴙᥲ𝇋ƙᘓᖇ
Re: XML::Parser does not parse the Symbol
by mirod (Canon) on Jul 11, 2013 at 10:57 UTC

    What's the error message? Without it I can only guess...

    ... that maybe the inverted question mark is encoded in extended-ascii (ISO-8859-1). Since you don't specify an encoding in the XML string, it is assumed to be in UTF-8, and you should get an "invalid character" or such error.

    If in your real code the string is hard-coded in the program file, then you need to use utf8;.

    If you get the data from a file, you need to either add an XML declaration specifying the encoding, pre-process the data to convert it to utf-8 or use the ProtocolEncoding option when you create the XML::Parser object (I would advise against this last solution though, better to keep the info about the encoding of the data with the data than in the code).
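
    The first two options can be sketched roughly as follows. This is only an illustration, assuming the input really is ISO-8859-1 and that the inverted question mark arrives as the single byte 0xBF; the variable names are made up for the example, and the Encode module is used for the conversion.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Encode qw(decode encode);
use XML::Parser;

# Assumption: the data is ISO-8859-1, so the inverted question mark is byte 0xBF.
my $latin1_body = "<symbols><![CDATA[Testing for Symbol \xBF]]></symbols>";

my $parser = XML::Parser->new(Style => 'Tree');

# Option 1: keep the encoding information with the data, in an XML declaration.
my $declared = qq{<?xml version="1.0" encoding="ISO-8859-1"?>\n} . $latin1_body;
my $tree_declared = $parser->parse($declared);

# Option 2: pre-process the bytes into UTF-8, the default when nothing is declared.
my $utf8_bytes = encode('UTF-8', decode('ISO-8859-1', $latin1_body));
my $tree_converted = $parser->parse($utf8_bytes);

# Tree style returns [ tag, [ \%attributes, 0, text, ... ] ],
# so the text of the root element sits at index 2 of the content array.
my $text_declared  = $tree_declared->[1][2];
my $text_converted = $tree_converted->[1][2];

binmode(STDOUT, ':encoding(UTF-8)');
print "$text_declared\n";
print "$text_converted\n";
```

    Both calls should hand back the same character string, since XML::Parser returns text as Perl's internal UTF-8 whatever the input encoding was.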

      It works fine if I use ProtocolEncoding:

      $parser = new XML::Parser(Style => 'Tree');
      $xml = $parser->parse($xml, ProtocolEncoding => 'ISO-8859-1');

      I have one more question: if we use ISO-8859-1, will it support UTF-8 as well?

        no

        Did you really read the part where I advised you NOT to use ProtocolEncoding?

        If you have to deal with XML, please educate yourself about encodings; it will pay off in the very short term.
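
        To see why the answer is "no", here is a small sketch using only the Encode module (no XML involved; the variable names are illustrative). It decodes the same UTF-8 bytes once as UTF-8 and once as ISO-8859-1. The second decode does not fail, it silently yields two wrong characters, which is what would happen to UTF-8 input forced through an ISO-8859-1 setting.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Encode qw(decode encode);

# The inverted question mark (U+00BF) encoded as UTF-8 takes two bytes: 0xC2 0xBF.
my $utf8_bytes = encode('UTF-8', "\x{BF}");

# Decoded with the right encoding: one character.
my $as_utf8 = decode('UTF-8', $utf8_bytes);

# Decoded as ISO-8859-1: every byte becomes its own character (U+00C2, U+00BF).
my $as_latin1 = decode('ISO-8859-1', $utf8_bytes);

printf "bytes: %d, as UTF-8: %d char, as ISO-8859-1: %d chars\n",
    length($utf8_bytes), length($as_utf8), length($as_latin1);
```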
