
How can I replace a line (tag) in an XML file?

by perlPractioner (Novice)
on Sep 29, 2011 at 00:47 UTC (#928451=perlquestion)
perlPractioner has asked for the wisdom of the Perl Monks concerning the following question:

Hi Monks, I need to replace a tag in an XML file that resembles the following:

<students>
  <student>
    <name>John</name>
    <id>001</id>
    <gpa>A</gpa>
  </student>
  <student>
    <name>John</name>
    <id>002</id>
    <gpa>C</gpa>
  </student>
</students>

How would I go about finding the second instance of John (whose id is '002') and replacing his GPA of 'C' with a 'B'? Keep in mind that searching the entire file for a 'C' and replacing it with a 'B' is not an option, because many students might have a GPA of 'C'. Since the id tag is unique for each student, is there a way to search for that unique tag, then replace the following line (in this case the gpa tag) with a new line containing the 'B' GPA? Any thoughts on this would be greatly appreciated. Thanks in advance!

Replies are listed 'Best First'.
Re: How can I replace a line (tag) in an XML file?
by toolic (Bishop) on Sep 29, 2011 at 01:10 UTC
    Using an XML parser, such as XML::Twig, is a good approach:

    use warnings;
    use strict;
    use XML::Twig;

    my $str = <<EOF;
    <students>
      <student>
        <name>John</name>
        <id>001</id>
        <gpa>A</gpa>
      </student>
      <student>
        <name>John</name>
        <id>002</id>
        <gpa>C</gpa>
      </student>
    </students>
    EOF

    my $t = XML::Twig->new(
        twig_handlers => { student => \&student, },
        pretty_print  => 'indented',
    );
    $t->parse($str);
    $t->print;    # prints the modified document to STDOUT
    print "\n";

    # Called once for each <student> element as soon as it has been parsed.
    sub student {
        my ($t, $elt) = @_;
        if ($elt->field('id') eq '002') {
            $elt->first_child('gpa')->set_text('B');
        }
    }

    __END__
    <students>
      <student>
        <name>John</name>
        <id>001</id>
        <gpa>A</gpa>
      </student>
      <student>
        <name>John</name>
        <id>002</id>
        <gpa>B</gpa>
      </student>
    </students>

    Note: I used XML::Tidy to indent your original XML code.

    Update: made sub more succinct.

      You can even put the test on the content of id in the handler condition, and not in the sub:

      twig_handlers => {
          'student[string(id)="002"]' => sub {
              $_->first_child('gpa')->set_text('B');
          },
      }

      Note that if the id is in a variable, quote delimiter collisions between Perl and XPath make it a little more annoying. Luckily Perl allows us to use whatever delimiter we want with qq{}:

      twig_handlers => {
          qq{student[string(id)="$id"]} => sub {
              $_->first_child('gpa')->set_text('B');
          },
      }

      Thanks for the timely response, I really appreciate it. I tested your code and it worked. However, I'd like to read the XML from a .xml file and replace the GPA tag (which you've already done). When I tried the following, I was able to read from the .xml file and replace the GPA tag, however the changes did not write back to the .xml file. I was only able to print the desired output but not able to write it back to the .xml file. This is what I have so far.

      use warnings;
      use strict;
      use XML::Twig;

      open FILE, 'test.xml';
      sysread(FILE, my $str, -s FILE);

      my $t = XML::Twig->new(
          twig_handlers => { student => \&student, },
          pretty_print  => 'indented',
      );
      $t->parse($str);
      $t->print($str);
      print "\n";
      close FILE;

      sub student {
          my ($t, $elt) = @_;
          if ($elt->field('id') eq '002') {
              $elt->first_child('gpa')->set_text('B');
          }
      }

      The print $str returns the desired output (changes the second instance of John's GPA from 'C' to 'B') but I'm not sure how to write this output back to the test.xml file. Any suggestions on this?

        You update the initial file the same way you would update a text file, or any other type of file: once you've read the file, close it, then open it for writing and print the output to that filehandle.

        The way you use open and the fact that you use sysread are both rather unusual, so either you have a very personal style or you have little Perl experience and are cargo-culting your way through the problem. You may want to read a little about Perl if you need to use it properly. Invest in "Learning Perl", or read the docs (at least the parts on open and print; type perldoc -f open to get them).
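The read, close, reopen-for-writing advice above could look roughly like the following. This is only a sketch: the file name test.xml comes from the question, the sample data is written out first purely so the example is self-contained, and the three-argument open with lexical filehandles is a style choice rather than anything the thread prescribes.

```perl
use strict;
use warnings;
use XML::Twig;

# Sample input, created here only so the sketch runs on its own.
my $file = 'test.xml';
open my $seed, '>', $file or die "Cannot create $file: $!";
print {$seed} <<'XML';
<students>
  <student><name>John</name><id>001</id><gpa>A</gpa></student>
  <student><name>John</name><id>002</id><gpa>C</gpa></student>
</students>
XML
close $seed or die "Cannot close $file: $!";

my $twig = XML::Twig->new(
    twig_handlers => {
        # Runs for every <student>; only the one with id 002 is changed.
        student => sub {
            my ($t, $elt) = @_;
            $elt->first_child('gpa')->set_text('B')
                if $elt->field('id') eq '002';
        },
    },
    pretty_print => 'indented',
);

# parsefile opens, reads and closes the input for us.
$twig->parsefile($file);

# Reopen the same file for writing and print the modified document into it.
open my $out, '>', $file or die "Cannot write $file: $!";
$twig->print($out);
close $out or die "Cannot close $file: $!";
```

Because the whole document is parsed before the file is reopened, nothing is lost by truncating it. For very large files, writing to a temporary file and renaming it afterwards would be the safer variation.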

Re: How can I replace a line (tag) in an XML file?
by Jenda (Abbot) on Sep 29, 2011 at 09:58 UTC
    use strict;
    use XML::Rules;

    my $parser = XML::Rules->new(
        style => 'filter',
        rules => {
            _default => 'raw',
            'id,gpa' => 'raw extended',
            student  => sub {
                my ($tag, $attrs, $parser) = @_[0,1,4];
                my $id = $attrs->{':id'}{_content};
                if (exists $parser->{parameters}{$id}) {
                    $attrs->{':gpa'}{_content} = $parser->{parameters}{$id};
                }
                return $tag => $attrs;
            },
        },
    );

    $parser->filter(\*DATA, \*STDOUT, {'002' => 'B'});

    __DATA__
    <students>
      <student>
        <name>John</name>
        <id>001</id>
        <gpa>A</gpa>
      </student>
      <student>
        <name>John</name>
        <id>002</id>
        <gpa>C</gpa>
      </student>
    </students>

    As with XML::Twig only the data of one student are kept in memory so this works even for huge files. If you needed to change the grades of several students, you can do it in one pass ... you just specify all the IDs and marks in the ->filter() call.

    Enoch was right!
    Enjoy the last years of Rome.

Re: How can I replace a line (tag) in an XML file?
by choroba (Bishop) on Sep 29, 2011 at 08:23 UTC
    I usually use XML::XSH2 for XML manipulation. In this case, this script would make it:
    open test.xml ;
    set /students/student[id="002"]/gpa "B" ;
    save :b ;

Re: How can I replace a line (tag) in an XML file?
by perlfan (Curate) on Sep 29, 2011 at 04:17 UTC
    This sounds like a good application for XML::XPath. Most of my XML parsing experience has been with XML::Simple's event based parsing, though.

      Apparently it bears repeating: pretty much nothing is "a good application for XML::XPath" at this point in the life of the module (it hasn't been updated in over 8 years). In the DOM/XPath modules category, XML::LibXML is the recommended solution.
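For comparison, a DOM/XPath version with XML::LibXML might look like the sketch below. The XPath expression is borrowed from the XML::XSH2 reply elsewhere in this thread, the sample data is inlined for self-containment, and the variable names ($id, $result) are purely illustrative.

```perl
use strict;
use warnings;
use XML::LibXML;

my $xml = <<'XML';
<students>
  <student><name>John</name><id>001</id><gpa>A</gpa></student>
  <student><name>John</name><id>002</id><gpa>C</gpa></student>
</students>
XML

# Build the whole document in memory (DOM style, unlike the streaming modules).
my $doc = XML::LibXML->load_xml(string => $xml);

# Select the <gpa> of the student whose <id> child has the wanted value.
my $id = '002';
for my $gpa ($doc->findnodes(qq{/students/student[id="$id"]/gpa})) {
    $gpa->removeChildNodes;    # drop the old text node
    $gpa->appendText('B');     # add the new grade
}

my $result = $doc->toString;
print $result;
```

Unlike XML::Twig or XML::Rules in their streaming modes, this keeps the entire tree in memory, which is fine for small files and less so for huge ones.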

Re: How can I replace a line (tag) in an XML file?
by andal (Hermit) on Sep 29, 2011 at 07:58 UTC

    One can also do it without any XML parser. Here's an example

    my $str = <<EOF;
    <students>
      <student>
        <name>John</name>
        <id>001</id>
        <gpa>A</gpa>
      </student>
      <student>
        <name>John</name>
        <id>002</id>
        <gpa>C</gpa>
      </student>
    </students>
    EOF

    $str =~ s/<student>(.*?)<\/student>/check_record($1)/gse;
    print $str;

    sub check_record {
        my $rec = shift;
        $rec =~ s/(<gpa>\s*)C(\s*<\/gpa>)/$1B$2/
            if ($rec =~ /<id>\s*002\s*<\/id>/);
        return "<student>$rec</student>";
    }

Re: How can I replace a line (tag) in an XML file?
by AlexTape (Monk) on Sep 29, 2011 at 10:00 UTC
    Is the layout of the XML fixed, or is it just your own idea of how to store the data?

    use strict;
    use XML::Simple;

    $filename = "yours.xml";
    xml_edit($filename);

    # return done 1 || error 0
    sub xml_edit($) {
        my $bool = 0;
        %DB = load($_[0]);
        if (exists $DB{student}{id}{2}) {
            $DB{student}{gpa} = 'C';
            $bool++;
        }
        save($_[0], \%DB);
        return $bool;
    }
    #untested

    But I recommend that you build a hash of your data within Perl and export it via "save($filename,\%hashref);"

    That way you get the right structure, and it stays easy to edit even if the data gets bulkier.
    $perlig =~ s/pec/cep/g if 'errors expected';

      The problem with XML::Simple (well, one of) when using it to edit a file is that it tends to change the XML in ways that do not make a difference when the file is read by XML::Simple, but will cause all other tools to fail to read it.

      In this particular case all those content-only tags will become attributes of the parent tag. XML::Simple will not mind, other tools most probably will.

      Not speaking about the fact that your code would not work ... there is no load() or save() in XML::Simple and the data structure would look different than what you seem to think.
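The element-to-attribute change described above can be seen in a small round trip. This is a sketch only: the KeyAttr, ForceArray and RootName settings are choices made for this illustration (with the default KeyAttr, the two students named John would be folded together), not options anyone in the thread used.

```perl
use strict;
use warnings;
use XML::Simple qw(XMLin XMLout);

my $xml = <<'XML';
<students>
  <student><name>John</name><id>001</id><gpa>A</gpa></student>
  <student><name>John</name><id>002</id><gpa>C</gpa></student>
</students>
XML

# KeyAttr => [] turns off folding on name/key/id;
# ForceArray keeps <student> as a list even if only one is present.
my $data = XMLin($xml, KeyAttr => [], ForceArray => ['student']);

for my $student (@{ $data->{student} }) {
    $student->{gpa} = 'B' if $student->{id} eq '002';
}

# Plain scalar values are written back as attributes, not child elements.
my $round_trip = XMLout($data, RootName => 'students');
print $round_trip;
```

The edit itself succeeds, but the output no longer contains <gpa> elements at all, which is exactly why other tools expecting the original layout are likely to choke on it.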

      Enoch was right!
      Enjoy the last years of Rome.

        I'm sorry - sunk in my own projects, I should probably post the full story ;-)

        sub load { return %{ XML::Simple->new(ForceContent => 1)->XMLin($_[0]) }; }   # loads xml
        sub save { XMLout($_[1], OutputFile => $_[0]); 1; }                           # saves hash to xml

        But you're still mostly right... for me it works fine.
        $perlig =~ s/pec/cep/g if 'errors expected';
