PerlMonks  

Parsing and manipulating XML

by hoppfrosch (Scribe)
on Apr 09, 2018 at 07:34 UTC ( #1212563=perlquestion )

hoppfrosch has asked for the wisdom of the Perl Monks concerning the following question:

Hello all,

I'm looking for some code to read and manipulate an XML node (preferably based on XML::LibXML).

I've got the following given XML-structure from an external project:

<Build-Doc>
  <Build>
    text.mak
    <Targets>all</Targets>
    <Nmake></Nmake>
  </Build>
</Build-Doc>

Now I want to read and manipulate only the text content of the <Build> node, i.e. the content "text.mak", whereas the child nodes <Targets> and <Nmake> should stay untouched. What I tried:

use XML::LibXML;

my $parser = XML::LibXML->new();
my $xmldoc = $parser->parse_file( $filepath );
my $value  = $xmldoc->findvalue( "/Build-Doc/Build[1]" );
# This returns " text.mak\n all" - which I DON'T want!
# What I want to get is " text.mak" only ...
# How do I manipulate "text.mak" and leave the rest of my
# xml-structure untouched?

Thanks for your ideas!

Replies are listed 'Best First'.
Re: Parsing and manipulating XML
by Your Mother (Bishop) on Apr 09, 2018 at 16:33 UTC

    The cow says, "Mooooooooooo…" :P

    use strictures;
    use XML::LibXML;

    my $doc = XML::LibXML->load_xml( IO => \*DATA );
    my $text = [ $doc->findnodes('/Build-Doc/Build/text()[1]') ]->[0];
    my $value = $text->data;
    $value =~ s/text/TACOS/;
    $text->setData($value);
    print $doc;
    __END__
    <Build-Doc>
      <Build>
        text.mak
        <Targets>all</Targets>
        <Nmake></Nmake>
      </Build>
    </Build-Doc>

    You can work directly with text as XML nodes. Just find the right XPath to the text node in your case or, if necessary, iterate over the text nodes (my example just grabs the first one and puts it in a scalar instead of an array of found nodes). Update: s/list/array/ + missing article.
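
For the "iterate over the text nodes" case mentioned above, here is a minimal sketch. It assumes the same XML as in the question (inlined as a string so it runs on its own), and the replacement name other.mak is purely hypothetical. It is one possible way to do it, not the only one.

```perl
use strict;
use warnings;
use XML::LibXML;

# Same structure as in the question, inlined so the sketch is self-contained.
my $xml = <<'EOT';
<Build-Doc>
  <Build>
    text.mak
    <Targets>all</Targets>
    <Nmake></Nmake>
  </Build>
</Build-Doc>
EOT

my $doc = XML::LibXML->load_xml( string => $xml );

# text() matches only text nodes that are direct children of <Build>;
# the "all" inside <Targets> belongs to <Targets> and is not matched.
for my $text ( $doc->findnodes('/Build-Doc/Build/text()') ) {
    my $value = $text->data;
    next unless $value =~ /\S/;          # skip whitespace-only nodes
    $value =~ s/text\.mak/other.mak/;    # hypothetical replacement value
    $text->setData($value);
}

my $out = $doc->toString;
print $out;
```

Skipping whitespace-only nodes keeps the indentation between the child elements as it was, so only the node that carries real text is rewritten.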

      Great - exactly what I was looking for! Thanks for your help - greatly appreciated
Re: Parsing and manipulating XML
by hippo (Chancellor) on Apr 09, 2018 at 08:27 UTC

    Here's an SSCCE:

    use strict;
    use warnings;
    use Test::More tests => 2;
    use XML::LibXML;

    my $doc = <<EOT;
    <Build-Doc>
      <Build>
        text.mak
        <Targets>all</Targets>
        <Nmake></Nmake>
      </Build>
    </Build-Doc>
    EOT

    my $parser = XML::LibXML->new();
    my $dom = $parser->load_xml (string => $doc);
    my @str;
    push @str, $dom->toString;
    my $content = $dom->getElementsByTagName ('Build')->pop->firstChild->textContent;
    like ($content, qr/^\s+text.mak\s+$/, 'Text matches');
    push @str, $dom->toString;
    is ($str[0], $str[1], 'DOM untouched');

    Not the most elegant, perhaps, but shows one approach.

Re: Parsing and manipulating XML
by Corion (Pope) on Apr 09, 2018 at 07:50 UTC

    Maybe using the no_blanks => 1 option when constructing your parser is already enough? This makes libxml skip the whitespace between nodes (which XML itself considers significant) instead of keeping it as text nodes.
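
A quick way to check what this option changes for the query from the question is to parse the same input with and without it and compare the results. This is only a sketch: the XML is inlined from the question, and the variable names are made up for illustration. Since no_blanks only concerns whitespace-only text, the string value of <Build> should still include the text of <Targets> in both cases, so the option may need to be combined with a text() based XPath.

```perl
use strict;
use warnings;
use XML::LibXML;

# Same structure as in the question, inlined so the sketch is self-contained.
my $xml = <<'EOT';
<Build-Doc>
  <Build>
    text.mak
    <Targets>all</Targets>
    <Nmake></Nmake>
  </Build>
</Build-Doc>
EOT

# One parser with default settings, one with no_blanks enabled.
my $plain_dom    = XML::LibXML->new()->load_xml( string => $xml );
my $noblanks_dom = XML::LibXML->new( no_blanks => 1 )->load_xml( string => $xml );

# The string value of an element is the concatenation of all text below it.
my $plain_value    = $plain_dom->findvalue('/Build-Doc/Build[1]');
my $noblanks_value = $noblanks_dom->findvalue('/Build-Doc/Build[1]');

for my $pair ( [ default => $plain_value ], [ no_blanks => $noblanks_value ] ) {
    my ( $label, $value ) = @$pair;
    ( my $visible = $value ) =~ s/\n/\\n/g;    # make newlines visible
    print "$label: '$visible'\n";
}
```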

Re: Parsing and manipulating XML
by Anonymous Monk on Apr 09, 2018 at 08:59 UTC
    text() is the XPath node test for text nodes, i.e. not tags/elements
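
To illustrate the difference, the sketch below runs three queries against the XML from the question (inlined here as a string). The use of the standard XPath 1.0 function normalize-space() to trim the surrounding whitespace is an addition for illustration and was not part of the original thread.

```perl
use strict;
use warnings;
use XML::LibXML;

# Same structure as in the question, inlined so the sketch is self-contained.
my $xml = <<'EOT';
<Build-Doc>
  <Build>
    text.mak
    <Targets>all</Targets>
    <Nmake></Nmake>
  </Build>
</Build-Doc>
EOT

my $doc = XML::LibXML->load_xml( string => $xml );

# Element: the string value includes the text of every descendant ("all" too).
my $whole = $doc->findvalue('/Build-Doc/Build[1]');

# First text node of <Build>: only its own leading text, whitespace included.
my $own = $doc->findvalue('/Build-Doc/Build/text()[1]');

# normalize-space() trims leading/trailing whitespace and collapses runs.
my $trimmed = $doc->findvalue('normalize-space(/Build-Doc/Build/text()[1])');

print "trimmed: '$trimmed'\n";    # trimmed: 'text.mak'
```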

Node Type: perlquestion [id://1212563]
Front-paged by Corion