PerlMonks  

Re: XML::Smart - undesired decoding of special XML characters

by 1nickt (Abbot)
on Oct 16, 2017 at 11:14 UTC (#1201437)


in reply to XML::Smart - undesired decoding of special XML characters

Hi, the doc for XML::Smart states:

When loading XML data with UTF-8, Perl (5.8+) should make all the work internally.

Please provide a short sample of the XML you are attempting to work with.


The way forward always starts with a minimal test.
Re^2: XML::Smart - undesired decoding of special XML characters
by NeedForPerl (Novice) on Oct 16, 2017 at 12:35 UTC

    Hi, thanks for the fast reply. For instance, I use the following XML file:

    <?xml version="1.0" encoding="UTF-8"?>
    <log>
      <logentry revision="12345">
        <author>someAuthor</author>
        <date>2017-10-11T09:32:15.704935Z</date>
        <msg>This is my SVN message with characters like ä or &amp;.</msg>
      </logentry>
    </log>

    After I execute the following script ...

    use XML::Smart;

    open(my $fh, "<", "test.xml") or die $!;
    my $logString;
    while (<$fh>) {
        $logString .= $_;
    }

    my $test = XML::Smart->new($logString);
    $test->{log}->{logentry}[0]->{msg}->set_binary('FALSE');
    print $test->data();

    ... I get the following result.

    <?xml version="1.0" encoding="UTF-8" ?>
    <?meta name="GENERATOR" content="XML::Smart/1.78 Perl/5.024001 [MSWin32]" ?>
    <log>
      <logentry revision="12345">
        <author>someAuthor</author>
        <date>2017-10-11T09:32:15.704935Z</date>
        <msg dt:dt="binary.base64">VGhpcyBpcyBteSBTVk4gbWVzc2FnZSB3aXRoIGNoYXJhY3RlcnMgbGlrZSDkIG9yICYu</msg>
      </logentry>
    </log>
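Decoding the Base64 payload by hand shows what ended up in the msg element. The following is a minimal sketch using the core MIME::Base64 module; the variable names are only illustrative, and the payload is copied from the output above with the line-wrap marker removed.

```perl
use strict;
use warnings;
use MIME::Base64 qw(decode_base64);

# Base64 payload of the <msg> element, as printed by data().
my $payload = 'VGhpcyBpcyBteSBTVk4gbWVzc2FnZSB3aXRoIGNoYXJhY3RlcnMgbGlrZSDkIG9yICYu';

my $bytes = decode_base64($payload);

# Render every non-printable or non-ASCII byte as \xNN so the result
# does not depend on the terminal's encoding.
(my $visible = $bytes) =~ s/([^\x20-\x7E])/sprintf('\\x%02X', ord $1)/ge;

print "$visible\n";
# prints: This is my SVN message with characters like \xE4 or &.
```

The single byte 0xE4 is "ä" in Latin-1 but is not a valid UTF-8 sequence on its own. That would be consistent with the file being saved in a different encoding than its XML declaration claims, although whether this is what makes XML::Smart treat the node as binary is an assumption, not something verified here.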

    If I call the subroutine data(decode => 1), the msg element contains the decoded message:

    <msg>This is my SVN message with characters like ä or &.</msg>

    But this output is invalid because "&" is not replaced by an escape sequence. I need an XML file with no Base64 encoding and with special XML characters such as "&" escaped; in short, a valid XML document without Base64 encoding. The XML parser that consumes the output of the script can't handle Base64 encoding. I wonder whether I can solve the problem by using the subroutine set_binary('FALSE').

    PS: I'm using Strawberry Perl v5.24.1 on Windows Server 2012 R2 Datacenter.
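For reference, the escaping being asked for can be sketched in plain Perl. This is not how XML::Smart does it internally; xml_escape and %entity are hypothetical names used only for illustration, and the sketch covers just the five predefined XML entities.

```perl
use strict;
use warnings;

# Map each XML special character to its predefined entity.
my %entity = (
    '&' => '&amp;',
    '<' => '&lt;',
    '>' => '&gt;',
    '"' => '&quot;',
    "'" => '&apos;',
);

# Hypothetical helper, not part of XML::Smart: escape text for use as
# element content or an attribute value. A single pass over the string
# avoids escaping the "&" of an entity that was just inserted.
sub xml_escape {
    my ($text) = @_;
    $text =~ s/([&<>"'])/$entity{$1}/g;
    return $text;
}

print xml_escape('characters like < or &.'), "\n";
# prints: characters like &lt; or &amp;.
```

Hand-rolled escaping like this is only a fallback; letting the XML module serialize the text node is usually the safer route when it can be made to work.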

      NeedForPerl:

      I just checked your code with Your Mother's recommendation and it worked just fine. I did, however, have to remove the 'ä' symbol because XML::Smart complained about it being encoded incorrectly. After that, and after changing 'FALSE' to 0, it gave me (presumably the expected output):

      <?xml version="1.0" encoding="UTF-8" ?>
      <?meta name="GENERATOR" content="XML::Smart/1.78 Perl/5.022004 [cygwin]" ?>
      <log>
        <logentry revision="12345">
          <author>someAuthor</author>
          <date>2017-10-11T09:32:15.704935Z</date>
          <msg>This is my SVN message with characters like or &amp;.</msg>
        </logentry>
      </log>

      I could easily have something munged in my various Windows/Cygwin/vim settings to have messed up the 'ä', but I'm mentioning it just in case you need to know of it.
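The change from 'FALSE' to 0 matters if set_binary() evaluates its argument as an ordinary Perl boolean, which is an assumption here and was not checked against the XML::Smart source. Under that assumption, the string 'FALSE' counts as true, because any non-empty string other than '0' is true in Perl. A small sketch, with is_true as a hypothetical helper name:

```perl
use strict;
use warnings;

# Hypothetical helper: report how Perl evaluates a value in boolean context.
sub is_true {
    my ($value) = @_;
    return $value ? 'true' : 'false';
}

# 'FALSE' and 'false' are non-empty strings, so Perl treats them as true;
# only 0, '0', the empty string, and undef are false.
for my $value ('FALSE', 'false', '0', 0, '') {
    printf "%-8s => %s\n", "'$value'", is_true($value);
}
```

Passing 0 (or an empty string) is therefore the way to express "false" to a Perl function that tests its argument for truth.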

      ...roboticus

      When your only tool is a hammer, all problems look like your thumb.

        Thank you. Everything works fine. I have no problems with the "ä".
