PerlMonks  

XML::Twig Text replacement

by Gizmo (Novice)
on Apr 30, 2010 at 17:16 UTC (#837821)
Gizmo has asked for the wisdom of the Perl Monks concerning the following question:

I've written two scripts, one with XML::Simple and one with XML::DOM, to do some replacements. They work, but with XML::Simple the output comes out alphabetically sorted, which I've read is a limitation of the module. I'm now trying to do the same with XML::Twig but can't figure out how. I've included the XML and the XML::Simple code below so you can see what I want to achieve: I want to change the Id attribute from Id="/Local/App/App1" to Id="/App1".

XML snip:
<?xml version="1.0" encoding="UTF-8" standalone="no" ?>
<Profile xmlns="xxxxxxxxx" name="" version="1.1" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
    <Application Name="App1" Id="/Local/App/App1" Services="1" policy="" StartApp="" Bal="5" sessInt="500" WaterMark="1.0"/>
    <AppProfileGuid>586e3456dt</AppProfileGuid>
</Profile>
XML::Simple code:
use XML::Simple;
my $xml = new XML::Simple (ForceArray => 1, KeepRoot => 1, KeyAttr => []);
my $data = $xml->XMLin($xmlfile);
my $Id = $data->{Profile}->[0]->{Application}->[0]->{Id};
my $CsID = (split(/\//, $Id))[-1];
$data->{Profile}->[0]->{Application}->[0]->{Id} = $CsID;
print $xml->XMLout($data);
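The split in the code above keeps only the last path segment, which drops the leading slash, while a prefix substitution (as one of the replies uses) keeps it. The following core-Perl sketch shows the difference; the variable names `$last_segment` and `$stripped` are made up for illustration.

```perl
use strict;
use warnings;

my $Id = '/Local/App/App1';

# Keep only the last path segment, as the XML::Simple code does.
my $last_segment = (split m{/}, $Id)[-1];

# Strip a fixed prefix instead; the slash before the name survives.
(my $stripped = $Id) =~ s{^/Local/App}{};

print "$last_segment\n";   # App1
print "$stripped\n";       # /App1
```

Which of the two is wanted depends on whether the new Id should start with a slash.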

Re: XML::Twig Text replacement
by ikegami (Pope) on Apr 30, 2010 at 18:28 UTC

    XML::Twig:

    use strict;
    use warnings;
    use XML::Twig qw( );

    binmode STDOUT;

    my $t = XML::Twig->new(
        twig_handlers => {
            '/Profile/Application' => sub {
                my $Id   = $_->att('Id');
                my $CsID = (split(/\//, $Id))[-1];
                $_->set_att(Id => $CsID);
            },
        },
    );
    $t->parsefile($ARGV[0]);
    $t->flush();

    XML::LibXML:

    use strict;
    use warnings;
    use XML::LibXML qw( );
    use XML::LibXML::XPathContext qw( );

    my $doc  = XML::LibXML->new()->parse_file($ARGV[0]);
    my $root = $doc->documentElement();

    my $xpc = XML::LibXML::XPathContext->new();
    $xpc->registerNs(x => 'xxxxxxxxx');

    for ($xpc->findnodes('/x:Profile/x:Application', $root)) {
        my $Id   = $_->getAttribute('Id');
        my $CsID = (split(/\//, $Id))[-1];
        $_->setAttribute(Id => $CsID);
    }

    binmode STDOUT;
    print $doc->toString();

    XML::LibXML is a bit wordier than XML::Twig (the 2 xpc lines) in order to handle namespaces correctly. (XML::Twig doesn't.)

      Thanks a lot, makes sense now. I'll try out the libXML too.

      Actually you can handle namespaces in XML::Twig, using the map_xmlns option. I am not sure it's worth doing in this case though (and it might be a good example of why I dislike seemingly gratuitous default namespaces, they just make processing harder while providing exactly 0 added value).
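As a rough sketch of what the map_xmlns approach might look like for the OP's document: the prefix `p` is an arbitrary choice made for this example, the XML is cut down to the relevant attributes, and the `@ids` array exists only to show what the handler saw.

```perl
use strict;
use warnings;
use XML::Twig;

my $xml = <<'EOX';
<Profile xmlns="xxxxxxxxx" name="" version="1.1">
  <Application Name="App1" Id="/Local/App/App1"/>
</Profile>
EOX

my @ids;
my $t = XML::Twig->new(
    # Map the document's default namespace URI to a prefix of our choosing.
    map_xmlns     => { 'xxxxxxxxx' => 'p' },
    twig_handlers => {
        # The handler is written with the mapped prefix.
        'p:Application' => sub {
            my ($twig, $elt) = @_;
            my $new_id = (split m{/}, $elt->att('Id'))[-1];
            $elt->set_att(Id => $new_id);
            push @ids, $elt->att('Id');
        },
    },
);
$t->parse($xml);
print "@ids\n";
```

If the document is printed afterwards, the element names may carry the mapped prefix rather than the original default namespace; the XML::Twig documentation describes a keep_original_prefix option for that case, which is worth checking before relying on the output format.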

      Also, if you pass the id => 'Id' option to new, you can then write $Id = $_->id and $_->set_id($CsID), which I think is slightly clearer. It has the added benefit, if need be, of letting you access an element directly through its id, using the elt_id method.
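A minimal sketch of the id => 'Id' suggestion, assuming a cut-down version of the OP's XML. The lookup of 'App1' at the end assumes that set_id also registers the new value for elt_id, so treat that part as something to verify against your installed XML::Twig.

```perl
use strict;
use warnings;
use XML::Twig;

my $xml = <<'EOX';
<Profile xmlns="xxxxxxxxx" name="" version="1.1">
  <Application Name="App1" Id="/Local/App/App1"/>
</Profile>
EOX

my $t = XML::Twig->new(
    id            => 'Id',    # treat the Id attribute as the element id
    twig_handlers => {
        Application => sub {
            my ($twig, $elt) = @_;
            my $new_id = (split m{/}, $elt->id)[-1];
            $elt->set_id($new_id);
        },
    },
);
$t->parse($xml);

# Look the element up directly by its (new) id.
my $app = $t->elt_id('App1');
print $app->att('Name'), "\n" if $app;
```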

        Ah good. I don't use XML::Twig, so I don't have a deep knowledge of it.

        If it consistently ignored namespaces when map_xmlns isn't used, it would be a great shortcut despite being non-standard, since there is rarely a need to deal with namespace conflicts. (The module never claimed its paths were real XPaths.) Unfortunately, it doesn't consistently ignore namespaces.

        On the plus side, it works according to standard when map_xmlns is used. (Well, I'm not sure how namespaces interact with attributes, so I'm simply commenting on elements.)

        use strict;
        use warnings;

        use XML::Twig qw( );

        my $xml = <<'__EOI__';
        <?xml version="1.0" encoding="UTF-8" standalone="no" ?>
        <root xmlns:foo="uri:foo">
            <ele id="a" />
            <ele id="b" xmlns="uri:foo"/>
            <foo:ele id="c" />
        </root>
        __EOI__

        {
            my $seen = '';

            my $t = XML::Twig->new(
                twig_handlers => {
                    'ele' => sub { $seen .= $_->att('id') },
                },
            );
            $t->parsestring($xml);

            print("$seen\n");
            print($seen eq 'a'                   ? "Standard\n"   : "Not standard\n");
            print($seen eq 'a' || $seen eq 'abc' ? "Consistent\n" : "Not consistent\n");
        }

        print("\n");

        {
            my $seen_null = '';
            my $seen_foo  = '';

            my $t = XML::Twig->new(
                map_xmlns => {
                    'uri:foo' => 'f',
                },
                twig_handlers => {
                    'ele'   => sub { $seen_null .= $_->att('id') || $_->att('f:id') },
                    'f:ele' => sub { $seen_foo  .= $_->att('id') || $_->att('f:id') },
                },
            );
            $t->parsestring($xml);

            print("$seen_null:$seen_foo\n");
            print($seen_null eq 'a'                        ? "Standard\n"   : "Not standard\n");
            print($seen_null eq 'a' || $seen_null eq 'abc' ? "Consistent\n" : "Not consistent\n");
            print($seen_foo eq 'bc'                        ? "NS working\n" : "NS broken\n");
        }
        Output:

        ab
        Not standard
        Not consistent

        a:bc
        Standard
        Consistent
        NS working
Re: XML::Twig Text replacement
by toolic (Bishop) on Apr 30, 2010 at 18:29 UTC
    use strict;
    use warnings;
    use XML::Twig;

    my $x = <<EOF;
    <?xml version="1.0" encoding="UTF-8" standalone="no" ?>
    <Profile xmlns="xxxxxxxxx" name="" version="1.1" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
        <Application Name="App1" Id="/Local/App/App1" Services="1" policy="" StartApp="" Bal="5" sessInt="500" WaterMark="1.0"/>
        <AppProfileGuid>586e3456dt</AppProfileGuid>
    </Profile>
    EOF

    my $t = XML::Twig->new(twig_handlers => {Application => \&app});
    $t->parse($x);
    $t->print();

    sub app {
        my ($twig, $app) = @_;
        my $id = $app->att('Id');
        $id =~ s{^/Local/App}{};
        $app->set_att('Id', $id);
    }
      That would match Application elements anywhere in the document, whereas the OP's code would only match Application elements directly under the root element. It might not make a difference, though.
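To make the difference concrete, here is a small sketch that runs both handler forms over the same input. The nested `Group` element and the second `Application` are hypothetical, added only so the two handlers have something to disagree about; the namespace declaration is left out to keep the example short.

```perl
use strict;
use warnings;
use XML::Twig;

my $xml = <<'EOX';
<Profile>
  <Application Id="/Local/App/App1"/>
  <Group>
    <Application Id="/Local/App/App2"/>
  </Group>
</Profile>
EOX

my (@anywhere, @under_root);

# A bare element name fires for every Application, at any depth.
XML::Twig->new(
    twig_handlers => {
        Application => sub { push @anywhere, $_->att('Id') },
    },
)->parse($xml);

# A full path fires only for Application directly under the root Profile.
XML::Twig->new(
    twig_handlers => {
        '/Profile/Application' => sub { push @under_root, $_->att('Id') },
    },
)->parse($xml);

print scalar(@anywhere), " vs ", scalar(@under_root), "\n";   # 2 vs 1
```

For the OP's sample document, which has a single Application under the root, both forms select the same element.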
