Preserving CDATA Tags When Saving XML

by THuG (Beadle)
on Jul 31, 2008 at 20:55 UTC
THuG has asked for the wisdom of the Perl Monks concerning the following question:

I was using XML::Simple to make very minor edits to some proprietary XML. It works well enough, but when I use XML::Simple's XMLout to save the XML hash, it strips the CDATA tags, leaving the enclosed text in the XML output.

Before I start down another XML library that may not do what I need it to do, I thought I'd ask for suggestions. Or perhaps someone knows how to get XML::Simple to preserve the CDATA tags.

I need a simple XML parser that I can use to do a quick read in of a small XML document, make a couple of changes, and then output the XML again without stripping the CDATA tags.

P.S. In response to ikegami: I am using XML::Simple's XMLout to turn the hash from XMLin back into XML. I haven't seen a flag for getting it to re-encase the bare text. Mind you, the source only seemed to use CDATA some of the time, not all of the time.

<![CDATA[Critical threshold for SVC (Automatic Service not Running)]]>
<![CDATA[*IF *VALUE NT_Services.Start_Type *EQ Automatic *AND *VALUE NT_Services.Current_State *NE Running]]>
<![CDATA[net start "&NT_Services.Service_Name"]]>

For anyone interested, that is the XML output from IBM Tivoli Monitoring version 6.

Re: Preserving CDATA Tags When Saving XML
by mirod (Canon) on Jul 31, 2008 at 21:12 UTC

    Normally CDATA sections are just an easy way to input text that might have special characters in it. For an XML processor there is absolutely no difference between a CDATA section and the same data encoded using &lt; and its friends.

    At least that's how things should be in an ideal world.

    But yes, in this less-than-perfect world, you can use XML::Twig. It will preserve CDATA sections; it can also preserve attribute order (some software is known to care about that; use the keep_atts_order option) and numerical entities (use the keep_encoding option, with caution). It's not quite as simple to use as XML::Simple, but it should be easy enough to learn for the task you describe.
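
    A minimal sketch of that approach, assuming XML::Twig is installed. The element names (situation, desc, cmd) and the replacement text are made up for illustration, since the original post shows only the CDATA sections and not the elements that contain them.

```perl
use strict;
use warnings;
use XML::Twig;

# Hypothetical wrapper elements; only the CDATA content comes from the post.
my $xml = <<'XML';
<situation>
  <desc><![CDATA[Critical threshold for SVC (Automatic Service not Running)]]></desc>
  <cmd><![CDATA[net start "&NT_Services.Service_Name"]]></cmd>
</situation>
XML

my $twig = XML::Twig->new;
$twig->parse($xml);

# XML::Twig keeps each CDATA section as its own child node, so locate
# that node and change its text instead of replacing the whole element.
my $desc = $twig->root->first_child('desc');
my ($cdata) = grep { $_->is_cdata } $desc->children;
$cdata->set_cdata('Warning threshold for SVC (Automatic Service not Running)');

# Serialize the document; the CDATA markers are written back out.
my $out = $twig->sprint;
print $out;
```

    The untouched cmd element should come back with its CDATA section intact, including the bare ampersand, which would otherwise have to be written as &amp;.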

Re: Preserving CDATA Tags When Saving XML
by ikegami (Pope) on Jul 31, 2008 at 21:02 UTC

    XML parsers convert XML to text, which entails decoding entities and processing CDATA tags. That's pretty much the definition of a parser. You must convert the text to XML before writing it to another XML file. What are you using to create the XML document? Maybe XML::Twig is more up your alley if you're doing small changes.
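
    A small sketch of what this means in practice with XML::Simple. The document here is invented for illustration: two elements carry the same text, one as a CDATA section and one with an escaped entity.

```perl
use strict;
use warnings;
use XML::Simple qw(XMLin XMLout);

# Hypothetical input: the same content written two different ways.
my $xml = '<r><a><![CDATA[x < y]]></a><b>x &lt; y</b></r>';

# After parsing, both values are the same plain Perl string; nothing
# records which one was originally a CDATA section.
my $data = XMLin($xml);
print "identical after parsing\n" if $data->{a} eq $data->{b};

# On output the text is converted back to XML by escaping special
# characters, so no CDATA section is regenerated.
my $out = XMLout($data, NoAttr => 1, RootName => 'r');
print $out;
```

    Because the hash holds only the decoded text, XMLout has no way to know which values should be wrapped in CDATA again.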

Re: Preserving CDATA Tags When Saving XML
by pajout (Curate) on Aug 01, 2008 at 15:12 UTC

    If you are not satisfied with XML::Simple or XML::Twig, you can try XML::Trivial. I designed it with the idea of exact parsing, keeping everything that is in the original document, CDATA sections in particular.

    Though XML::Trivial builds only a read-only document tree, it should be easy to iterate over the parsed tree and print the desired output, or to hack the XML::Trivial::Element::sr method, which serializes any element.

