
Re: XML::Parser does not parse the Symbol

by mirod (Canon)
on Jul 11, 2013 at 10:57 UTC (#1043693)

in reply to XML::Parser does not parse the Symbol

What's the error message? Without it I can only guess...

... that maybe the inverted question mark is encoded in extended ASCII (ISO-8859-1). Since you don't specify an encoding in the XML string, the parser assumes it is UTF-8, and you should get an "invalid character" or similar error.

If in your real code the string is hard-coded in the program file, then you need to add use utf8; (and save the program file as UTF-8).

If you get the data from a file, you need to either add an XML declaration specifying the encoding, pre-process the data to convert it to UTF-8, or use the ProtocolEncoding option when you create the XML::Parser object. I would advise against this last solution, though: it is better to keep the information about the encoding of the data with the data than in the code.
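As a rough sketch of the first two options, assuming the data really is ISO-8859-1: the sample string, the variable names and the try_parse helper below are made up for illustration and are not the original poster's code. It first shows the undeclared bytes failing to parse, then parses them with an XML declaration prepended, and finally after converting them to UTF-8 with the core Encode module.

```perl
use strict;
use warnings;
use XML::Parser;
use Encode qw(decode encode);

# Hypothetical sample: ISO-8859-1 bytes, where \xBF is the inverted
# question mark and \xE9 is e-acute.
my $latin1 = "<doc>\xBFQu\xE9 tal?</doc>";

# Hypothetical helper: parse a string, return the tree or undef on error.
sub try_parse {
    my ($xml) = @_;
    my $parser = XML::Parser->new(Style => 'Tree');
    my $tree = eval { $parser->parse($xml) };
    return $tree;
}

# No declaration: the parser assumes UTF-8 and rejects the bytes.
my $ok = defined try_parse($latin1) ? 1 : 0;
print "without declaration: ", ($ok ? "parsed" : "failed"), "\n";

# Option 1: declare the encoding in the data itself.
my $declared = qq{<?xml version="1.0" encoding="ISO-8859-1"?>\n} . $latin1;
my $tree1 = try_parse($declared);

# Option 2: convert the bytes to UTF-8 before parsing.
my $utf8  = encode('UTF-8', decode('ISO-8859-1', $latin1));
my $tree2 = try_parse($utf8);

# Tree style returns [ tag, [ {attributes}, 0, text ] ].
my $text1 = $tree1->[1][2];
my $text2 = $tree2->[1][2];
print "both options give the same text\n" if $text1 eq $text2;
```

Either way the parser hands back the same character data; the difference is only where the knowledge about the encoding lives.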

Replies are listed 'Best First'.
Re^2: XML::Parser does not parse the Symbol
by gopalr (Priest) on Jul 11, 2013 at 12:17 UTC

    It works fine if I use ProtocolEncoding:

    $parser = XML::Parser->new(Style => 'Tree');
    $xml = $parser->parse($xml, ProtocolEncoding => 'ISO-8859-1');

    I have one more question: if we use ISO-8859-1, will it also support UTF-8?


      Did you really read the part where I advised you NOT to use ProtocolEncoding?

      If you have to deal with XML, please educate yourself about encodings; it will pay off in the very short term.

        Please correct me if I am wrong.

        Do you mean adding the encoding to the XML declaration in the XML file, as follows?

        <?xml version="1.0" encoding="ISO-8859-1"?>

        Is that what you are suggesting?
