Group XML

by ymchaitu (Initiate)
on Jun 07, 2011 at 11:46 UTC (#908451)
ymchaitu has asked for the wisdom of the Perl Monks concerning the following question:

Hi, I have an XML file in this format:
    <?xml version="1.0" encoding="UTF-8"?>
    <group>
      <article-ref rid="doi-ANTI_1995-63-2_001-1"/>
      <group>
        <title>America</title>
        <group>
          <title>Pakistan</title>
          <group>
            <title>India</title>
            <article-ref rid="doi-ANTI_1995-63-2_001-2"/>
          </group>
        </group>
      </group>
      <article-ref rid="doi-ANTI_1995-63-2_001-3"/>
      <group>
        <title>SYMPOSIUM: POST-CHICAGO ECONOMICS</title>
        <article-ref rid="doi-ANTI_1995-63-2_001-4"/>
        <article-ref rid="doi-ANTI_1995-63-2_001-5"/>
        <article-ref rid="doi-ANTI_1995-63-2_001-6"/>
        <article-ref rid="doi-ANTI_1995-63-2_001-7"/>
        <article-ref rid="doi-ANTI_1995-63-2_001-8"/>
        <article-ref rid="doi-ANTI_1995-63-2_001-9"/>
        <article-ref rid="doi-ANTI_1995-63-2_001-10"/>
        <article-ref rid="doi-ANTI_1995-63-2_001-11"/>
        <article-ref rid="doi-ANTI_1995-63-2_001-12"/>
        <article-ref rid="doi-ANTI_1995-63-2_001-13"/>
      </group>
      <article-ref rid="doi-ANTI_1995-63-2_001-14"/>
    </group>
This XML needs to be converted to the following output format:
    </group>
    <group>
      <group>
        <title><![CDATA[America]]></title>
        <group>
          <title><![CDATA[Pakistan]]></title>
          <group>
            <title><![CDATA[India]]></title>
            <article-ref rid="001-2"/>
          </group>
        </group>
      </group>
    </group>
    <group>
      <article-ref rid="001-3"/>
    </group>
    <group>
      <group>
        <title><![CDATA[SYMPOSIUM: POST-CHICAGO ECONOMICS]]></title>
        <article-ref rid="001-4"/>
        <article-ref rid="001-5"/>
        <article-ref rid="001-6"/>
        <article-ref rid="001-7"/>
        <article-ref rid="001-8"/>
        <article-ref rid="001-9"/>
        <article-ref rid="001-10"/>
        <article-ref rid="001-11"/>
        <article-ref rid="001-12"/>
        <article-ref rid="001-13"/>
      </group>
    </group>
    <group>
      <article-ref rid="001-14"/>
    </group>
I have no idea where to start. Could anyone kindly suggest how to approach this?

Replies are listed 'Best First'.
Re: Group XML
by toolic (Bishop) on Jun 07, 2011 at 12:43 UTC
    Consider using an XML parser, such as XML::Twig. Read the documentation, work through the tutorial, write some code, and if you still have problems, post specific questions here.
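As a rough illustration of what a parser gives you to start from, here is a minimal sketch that loads a shortened copy of the input and prints its nesting as an outline. It uses Python's standard `xml.dom.minidom` rather than XML::Twig, purely so that it runs without installing anything. The `outline` helper and the trimmed `SAMPLE` are made up for this example and are not from the original post.

```python
from xml.dom import minidom

# Shortened copy of the poster's input (an assumption made for brevity).
SAMPLE = """<group>
  <article-ref rid="doi-ANTI_1995-63-2_001-1"/>
  <group>
    <title>America</title>
    <article-ref rid="doi-ANTI_1995-63-2_001-2"/>
  </group>
  <article-ref rid="doi-ANTI_1995-63-2_001-3"/>
</group>"""


def outline(node, depth=0):
    """Return one indented line per element below `node`."""
    lines = []
    for child in node.childNodes:
        if child.nodeType != child.ELEMENT_NODE:
            continue  # skip whitespace-only text nodes between elements
        if child.tagName == "title":
            label = "title: " + child.firstChild.data
        elif child.tagName == "article-ref":
            label = "article-ref: " + child.getAttribute("rid")
        else:
            label = child.tagName
        lines.append("  " * depth + label)
        lines.extend(outline(child, depth + 1))
    return lines


doc = minidom.parseString(SAMPLE)
tree = outline(doc)
print("\n".join(tree))
```

Once the structure is visible like this, it becomes easier to state which elements should end up wrapped in which `group`.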
      Talk about doing things the hard way ;p It's a simple XSLT transform.

        Oh, a candidate for a link to Just use an XSLT stylesheet! Would you care to elaborate and show us that "simple transformation"? Thanks.

Re: Group XML
by choroba (Chancellor) on Jun 07, 2011 at 15:24 UTC
    I do not understand exactly what should be wrapped into groups (the beginning of the output is missing, anyway). To insert CDATA and change rid attributes, you can use XML::XSH2 in this way:
        open 908451.xml ;
        for $t in //title/text() insert cdata $t replace $t ;
        for //article-ref/@rid insert text xsh:subst(., '.*-2_' , '') replace . ;
    The rest would also be possible if you can explain the algorithm.
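For anyone without XSH2 at hand, the same two steps (wrapping title text in CDATA and trimming the `rid` prefix) can be sketched with Python's standard `xml.dom.minidom`. This is only an approximation of the snippet above, not a translation checked against XSH2. The `convert` function and the one-group `SAMPLE` are hypothetical names introduced for the example, and the regrouping step is left out because the algorithm has not been specified.

```python
import re
from xml.dom import minidom

# One group from the poster's input, used as a small test case (assumption).
SAMPLE = (
    '<group><title>India</title>'
    '<article-ref rid="doi-ANTI_1995-63-2_001-2"/></group>'
)


def convert(xml_text):
    """Wrap title text in CDATA and strip the rid prefix up to '-2_'."""
    doc = minidom.parseString(xml_text)

    # Step 1: replace each text node under <title> with a CDATA section.
    for title in doc.getElementsByTagName("title"):
        for node in list(title.childNodes):
            if node.nodeType == node.TEXT_NODE:
                title.replaceChild(doc.createCDATASection(node.data), node)

    # Step 2: same pattern as the xsh:subst call, '.*-2_' replaced by ''.
    for ref in doc.getElementsByTagName("article-ref"):
        rid = ref.getAttribute("rid")
        ref.setAttribute("rid", re.sub(r".*-2_", "", rid))

    # Serialize the root element only, without the XML declaration.
    return doc.documentElement.toxml()


result = convert(SAMPLE)
print(result)
```

The pattern `.*-2_` is tied to this particular issue number, so a different input file would likely need a different expression.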
