PerlMonks  

Update nodes in XML document

by corfuitl (Sexton)
on Mar 26, 2018 at 15:32 UTC (#1211760=perlquestion)

corfuitl has asked for the wisdom of the Perl Monks concerning the following question:

Hi all

I have the XML document below and I would like to update some nodes according to a list.

Here is the XML document:

    ....
    <unit id="3Ojs1SLwMmEHK8mJ0_dc2:43">
      <titleEN>{3&gt;19&lt;3}</titleEN>
      <titleFR>{3&gt;19&lt;3}</titleFR>
      <alt-title origin="tilte"><otherTitle/></alt-title>
      <alt-title origin="price"><otherPrice/></alt-title>
    </unit>
    ....

and here is the code:

    use XML::LibXML;

    my $parser = XML::LibXML->new();
    my $tree   = $parser->parse_file($xml);
    my $root   = $tree->getDocumentElement;
    my ($application_id_node) = $root->findnodes('//file/body/group/unit/alt-title/otherTitle/text()');
    $application_id_node->removeChildNodes();
    $application_id_node->appendText('new value');

How can I update the node below?

    <alt-title origin="tilte"><otherTitle/></alt-title>

so that it becomes

    <alt-title origin="tilte"><otherTitle>new value</otherTitle></alt-title>

Replies are listed 'Best First'.
Re: Update nodes in XML document
by choroba (Archbishop) on Mar 26, 2018 at 15:52 UTC
    If an element is empty, it has no text() child nodes, so there is nothing to call a method on. Even if it had one, you wouldn't want to remove the child nodes of the text() node, but the child nodes of its parent, i.e. the otherTitle.

    Note how the following script is an SSCCE (which your question wasn't, which made it harder for us to help you):

        #!/usr/bin/perl
        use warnings;
        use strict;
        use feature qw{ say };

        use XML::LibXML;

        for my $other_title (
            '<alt-title origin="tilte"><otherTitle/></alt-title>',
            '<alt-title origin="tilte"><otherTitle>old value</otherTitle></alt-title>',
        ) {
            my $xml = qq(
                <file>
                  <body>
                    <group>
                      <unit id="3Ojs1SLwMmEHK8mJ0_dc2:43">
                        $other_title
                      </unit>
                    </group>
                  </body>
                </file>
            );

            my $dom = 'XML::LibXML'->load_xml(string => $xml);
            my ($other_title) = $dom->findnodes('//file/body/group/unit/alt-title/otherTitle');
            if ($other_title->findnodes('text()')) {
                $other_title->removeChildNodes;
            }
            $other_title->appendText('new value');
            print $dom;
        }
    ($q=q:Sq=~/;[c](.)(.)/;chr(-||-|5+lengthSq)`"S|oS2"`map{chr |+ord }map{substrSq`S_+|`|}3E|-|`7**2-3:)=~y+S|`+$1,++print+eval$q,q,a,

      Thank you for your reply.

      I tested it and it works. However, I am not able to adapt it so that the XML is loaded from STDIN. I am sorry, but I don't have strong Perl skills :(

      Also, the new values will be taken from another file, which will contain the same number of lines as there are nodes.

      I would appreciate it if you could help me or explain your code in more detail.

      Last, does this code update only the alt-title with origin="tilte"?

      Thanks

        > the XML will be loaded from STDIN. I am sorry, but I don't have strong perl skills :(

        Perl skills are irrelevant here. You just need to read the documentation.

        my $dom = 'XML::LibXML'->load_xml(IO => *STDIN);

        > update only the alt-title with origin="tilte"

        Again, not a Perl skills question. findnodes uses XPath Expressions, so just modify it:

            my ($other_title) = $dom->findnodes('//file/body/group/unit/alt-title[@origin="tilte"]/otherTitle');

        Is it "tilte" or "title"?
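
The position of the predicate matters: it has to sit on the element that actually carries the attribute. In the document from the question, origin belongs to alt-title, not to otherTitle. A minimal sketch of the difference follows; the sample XML, the shortened unit id and the helper name count_matches are made up for illustration.

```perl
#!/usr/bin/perl
use warnings;
use strict;
use XML::LibXML;

# Hypothetical sample modelled on the <unit> from the question.
my $SAMPLE_XML = <<'XML';
<file><body><group>
  <unit id="u1">
    <alt-title origin="tilte"><otherTitle/></alt-title>
    <alt-title origin="price"><otherPrice/></alt-title>
  </unit>
</group></body></file>
XML

# Illustrative helper: how many nodes does an XPath expression select?
sub count_matches {
    my ($xpath) = @_;
    my $dom   = XML::LibXML->load_xml(string => $SAMPLE_XML);
    my @nodes = $dom->findnodes($xpath);
    return scalar @nodes;
}

# Predicate on alt-title, then a step down to its otherTitle child:
# selects the one otherTitle under origin="tilte".
print count_matches('//unit/alt-title[@origin="tilte"]/otherTitle'), "\n";

# Predicate on otherTitle itself: selects nothing here, because
# otherTitle has no origin attribute in this document.
print count_matches('//unit/alt-title/otherTitle[@origin="tilte"]'), "\n";
```

If the attribute value in the real file turns out to be "title" rather than "tilte", only the string inside the predicate needs to change.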

        > the new values will be taken from another file which will contain the same number of lines as the nodes.

        Do you know how to read lines from a file? See open and the diamond operator in perlop. For example, I created an input file called 1 with the following contents:

            a
            b
            c

        Then, I created a simple XML file called 1.xml:

            <root>
              <one>
                <two origin="a"/>
                <two origin="b"/>
              </one>
              <one>
                <two origin="b"/>
                <two origin="a"/>
              </one>
              <one>
                <two origin="a">old</two>
                <two origin="b">old</two>
              </one>
            </root>

        Finally, here's the script that replaces the text in two with origin="a" by the values read from the given file. $. is a special variable that contains the current line number of the last filehandle read, so you can use it to index the nodes in the XPath Expression:

            #!/usr/bin/perl
            use warnings;
            use strict;

            use XML::LibXML;

            my $new_values_file = shift;

            my $dom = 'XML::LibXML'->load_xml(IO => *STDIN);

            open my $in, '<', $new_values_file or die "$new_values_file: $!";
            while (<$in>) {
                chomp;
                my ($two) = $dom->findnodes("/root/one[$.]/two[\@origin='a']");
                $two->removeChildNodes if $two->findnodes('text()');
                $two->appendText($_);
            }
            print $dom;

        Run as

        perl script.pl 1 < 1.xml
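
The same pattern could be carried over to the document from the question. The sketch below is only one possible adaptation: the sub name fill_titles, the two-unit sample and the in-memory value list are assumptions made so the example is self-contained. Instead of indexing with $., it collects the matching nodes once (findnodes returns them in document order) and pairs them with the values by position.

```perl
#!/usr/bin/perl
use warnings;
use strict;
use XML::LibXML;

# Illustrative helper: put the i-th value into the i-th matching
# otherTitle and return the serialized document.
sub fill_titles {
    my ($dom, @values) = @_;
    my @nodes = $dom->findnodes('//unit/alt-title[@origin="tilte"]/otherTitle');
    die 'Found ' . @nodes . ' nodes but ' . @values . " values\n"
        unless @nodes == @values;
    for my $i (0 .. $#nodes) {
        # Drop any old content first, then add the new text.
        $nodes[$i]->removeChildNodes if $nodes[$i]->hasChildNodes;
        $nodes[$i]->appendText($values[$i]);
    }
    return $dom->toString;
}

# Hypothetical stand-ins for STDIN and the values file.
my $sample = <<'XML';
<file><body><group>
  <unit id="u1">
    <alt-title origin="tilte"><otherTitle/></alt-title>
    <alt-title origin="price"><otherPrice/></alt-title>
  </unit>
  <unit id="u2">
    <alt-title origin="tilte"><otherTitle>old value</otherTitle></alt-title>
    <alt-title origin="price"><otherPrice/></alt-title>
  </unit>
</group></body></file>
XML

my $dom = XML::LibXML->load_xml(string => $sample);
print fill_titles($dom, 'first', 'second');
```

For real input, the document would come from load_xml(IO => *STDIN) and the values from a file read line by line with chomp, as in the script above. The count check is a design choice: it stops early if the file and the document disagree, rather than silently leaving nodes untouched.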
