PerlMonks  

Preserving CDATA Tags When Saving XML

by THuG (Beadle)
on Jul 31, 2008 at 20:55 UTC (#701522=perlquestion)
THuG has asked for the wisdom of the Perl Monks concerning the following question:

I was using XML::Simple to make very minor edits to some proprietary XML. It works well enough, but when I use XML::Simple's XMLout to save the XML hash, it strips the CDATA tags, leaving the enclosed text in the XML output.

Before I start down the path of another XML library that may not do what I need, I thought I'd ask for suggestions. Or perhaps someone knows how to get XML::Simple to preserve the CDATA tags.

I need a simple XML parser that I can use to quickly read in a small XML document, make a couple of changes, and then write the XML back out without stripping the CDATA tags.

p.s. In response to ikegami, I am using XML::Simple's XMLout to turn the hash from XMLin back into XML. I haven't seen a flag for getting it to re-encase the bare text. Mind you, the source only seemed to use CDATA some of the time, not all of the time.

<TABLE>
<ROW>
<SITNAME>
BS_NT_DEF_OPS_SVC_Auto_Crit
</SITNAME>
<TEXT>
<![CDATA[Critical threshold for SVC (Automatic Service not Running)]]>
</TEXT>
<AFFINITIES>
00080000000000000000000000000000#*########F
</AFFINITIES>
<PDT>
<![CDATA[*IF *VALUE NT_Services.Start_Type *EQ Automatic *AND *VALUE NT_Services.Current_State *NE Running]]>
</PDT>
<REEV_DAYS>
0
</REEV_DAYS>
<REEV_TIME>
000300
</REEV_TIME>
<AUTOSTART>
*YES
</AUTOSTART>
<ADVISE>
<![CDATA[*NONE]]>
</ADVISE>
<CMD>
<![CDATA[net start "&NT_Services.Service_Name"]]>
</CMD>
<AUTOSOPT>
YNN
</AUTOSOPT>
<DISTRIBUTION>
BL_NT_DEF,BL_NT_DEF_V
</DISTRIBUTION>
<ALERTLIST>
</ALERTLIST>
<HUB>
</HUB>
<QIBSCOPE>
E
</QIBSCOPE>
<SENDMSGQ>
*NONE
</SENDMSGQ>
<DESTNODE>
</DESTNODE>
<LOCFLAG>
</LOCFLAG>
<LSTCCSID>
en_US
</LSTCCSID>
<LSTDATE>
1080528162416000
</LSTDATE>
<LSTRELEASE>
V100
</LSTRELEASE>
<LSTUSRPRF>
u758725
</LSTUSRPRF>
<NOTIFYARGS>
</NOTIFYARGS>
<NOTIFYOPTS>
</NOTIFYOPTS>
<OBJECTLOCK>
</OBJECTLOCK>
<PRNAMES>
</PRNAMES>
<REFLEXOK>
</REFLEXOK>
<SITINFO>
<![CDATA[COUNT=2;ATOM=NTSERVICE.SRVCNAME;TFWD=Y;SEV=Fatal;TDST=0;~]]>
</SITINFO>
<SOURCE>
</SOURCE>
</ROW>
</TABLE>
For anyone interested, that is the XML output from IBM Tivoli Monitoring version 6.

Re: Preserving CDATA Tags When Saving XML
by ikegami (Pope) on Jul 31, 2008 at 21:02 UTC
    XML parsers convert XML to text, which entails decoding entities and processing CDATA tags. That's pretty much the definition of a parser. You must convert the text back to XML (by escaping special characters or wrapping it in CDATA) before writing it to another XML file. What are you using to create the XML document? Maybe XML::Twig is more up your alley if you're making small changes.
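
To illustrate the "convert the text to XML" step, here is a minimal sketch in core Perl. The helper names xml_escape and cdata_wrap are hypothetical (they do not come from XML::Simple or any other module); the sample string is the CMD value from the question, as a parser would hand it back after decoding.

```perl
use strict;
use warnings;

# Hypothetical helper: escape the characters that are special in XML text.
sub xml_escape {
    my ($text) = @_;
    $text =~ s/&/&amp;/g;    # must run first, or the entities below get re-escaped
    $text =~ s/</&lt;/g;
    $text =~ s/>/&gt;/g;
    return $text;
}

# Hypothetical helper: wrap text in a CDATA section. A literal "]]>" would
# close the section early, so it is split across two adjacent sections.
sub cdata_wrap {
    my ($text) = @_;
    $text =~ s/\]\]>/]]]]><![CDATA[>/g;
    return '<![CDATA[' . $text . ']]>';
}

# Decoded text, as a parser returns it (no CDATA markers, no entities).
my $cmd = 'net start "&NT_Services.Service_Name"';

print '<CMD>', xml_escape($cmd), "</CMD>\n";
# <CMD>net start "&amp;NT_Services.Service_Name"</CMD>

print '<CMD>', cdata_wrap($cmd), "</CMD>\n";
# <CMD><![CDATA[net start "&NT_Services.Service_Name"]]></CMD>
```

Both lines carry the same text as far as a conforming XML parser is concerned; they differ only in how that text is spelled in the file.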
Re: Preserving CDATA Tags When Saving XML
by mirod (Canon) on Jul 31, 2008 at 21:12 UTC

    Normally CDATA sections are just an easy way to input text that might have special characters in it. For an XML processor there is absolutely no difference between a CDATA section and the same data encoded using &lt; and its friends.

    At least that's how things should be in an ideal world.

    But yes, in this less-than-perfect world, you can use XML::Twig. It will preserve CDATA sections. It can also preserve attribute order (some software is known to care about that; use the keep_atts_order option) and numerical entities (use the keep_encoding option, with caution). It's not quite as simple to use as XML::Simple, but it should be easy enough to learn for the task you describe.
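
A minimal sketch of the XML::Twig round trip described above, assuming XML::Twig is installed from CPAN. The input is a cut-down version of the ROW from the question, and the edit (changing AUTOSTART to *NO) is only an illustration, not something the original poster asked for.

```perl
use strict;
use warnings;
use XML::Twig;

# Cut-down sample based on the document in the question.
my $xml = <<'XML';
<TABLE>
<ROW>
<AUTOSTART>*YES</AUTOSTART>
<ADVISE><![CDATA[*NONE]]></ADVISE>
<CMD><![CDATA[net start "&NT_Services.Service_Name"]]></CMD>
</ROW>
</TABLE>
XML

my $twig = XML::Twig->new();
$twig->parse($xml);

# Edit an element that holds plain text; the CDATA elements are left alone.
my $row = $twig->root->first_child('ROW');
$row->first_child('AUTOSTART')->set_text('*NO');

# Serialize the whole document back to a string.
my $out = $twig->sprint;
print $out, "\n";
```

In the output, ADVISE and CMD should still contain their CDATA sections. One caveat, as far as I understand the XML::Twig documentation: calling set_text on an element such as ADVISE replaces its children with plain text, so to change the content of a CDATA section while keeping the markers you would work on the CDATA child itself rather than on its parent.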

Re: Preserving CDATA Tags When Saving XML
by pajout (Curate) on Aug 01, 2008 at 15:12 UTC
    If you are not satisfied with XML::Simple or XML::Twig, you can try XML::Trivial. I designed it with the idea of exact parsing and of keeping everything that is in the original document, including CDATA sections.

    Though XML::Trivial builds only a read-only document tree, it should be easy to iterate over the parsed tree and print the desired output, or to hack the XML::Trivial::Element::sr method, which serializes any element.

Node Type: perlquestion [id://701522]
Approved by ikegami