http://www.perlmonks.org?node_id=975660

Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

I have XML in the form:

<a:houses>
  <a:house>
    <a:sq_footage/>
    <a:address/>
  </a:house>
  <a:house>
    <a:sq_footage/>
    <a:address/>
  </a:house>
</a:houses>

I need to strip off the namespace a. The XML::Twig CPAN page talks about mapping namespaces, but I am unclear whether that will solve my problem here.

Any insight shared would be appreciated. Thanks!

Replies are listed 'Best First'.
Re: removing namespaces with XML::Twig?
by ikegami (Patriarch) on Jun 11, 2012 at 23:38 UTC

    "a" is not a namespace, it's a prefix.

    Are you trying to strip the prefix without changing the element's namespace, so that

    <a:houses xmlns:a="..."> <a:house> ... </a:house> ... </a:houses>

    becomes

    <houses xmlns="..."> <house> ... </house> ... </houses>

    Or are you trying to remove the element's namespace altogether, and thus strip the prefix, so that

    <a:houses xmlns:a="..."> <a:house> ... </a:house> ... </a:houses>

    becomes

    <houses> <house> ... </house> ... </houses>

    (Either way, can't help you.)

      Thanks for the correction. I'm wanting to strip off the prefix.
        The prefix gets stripped off in both versions, so that doesn't answer anything.
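
The difference between the two outcomes can be made visible by asking a namespace-aware parser what it sees. This is only a sketch: it uses XML::LibXML rather than XML::Twig, the helper name root_info is hypothetical, and the URI urn:example:houses is a stand-in for the "..." in the examples above.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use XML::LibXML;

# Hypothetical helper: report how the root element of $xml is named.
sub root_info {
    my ($xml) = @_;
    my $root = XML::LibXML->load_xml( string => $xml )->documentElement;
    return {
        prefix    => $root->prefix,          # undef when there is no prefix
        localname => $root->localname,
        namespace => $root->namespaceURI,    # undef when in no namespace
    };
}

# The URI is a placeholder for illustration only.
my @variants = (
    '<a:houses xmlns:a="urn:example:houses"/>',    # prefixed, in a namespace
    '<houses xmlns="urn:example:houses"/>',        # no prefix, same namespace
    '<houses/>',                                   # no prefix, no namespace
);

for my $xml (@variants) {
    my $info = root_info($xml);
    printf "%-45s prefix=%s namespace=%s\n", $xml,
        $info->{prefix}    // '(none)',
        $info->{namespace} // '(none)';
}
```

The second and third documents both lack a prefix, but only the second keeps the element in the original namespace, which is why "strip the prefix" alone does not say which result is wanted.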
Re: removing namespaces with XML::Twig?
by mirod (Canon) on Jun 12, 2012 at 07:08 UTC

    As mentioned above, you need to set each tag to a non-prefixed version of itself.

    _all_ triggers a handler on each tag, tag gives you the tag name and set_tag sets it:

    #!/usr/bin/perl
    use strict;
    use warnings;
    use XML::Twig;

    XML::Twig->new( start_tag_handlers => { _all_ => sub { my $tag= $_->tag;
                                                           $tag=~ s{^[^:]*:}{};
                                                           $_->set_tag( $tag);
                                                         }
                                          },
                    keep_spaces => 1, # to keep formatting
                  )
             ->parse( \*DATA)
             ->print;

    __DATA__
    <a:houses>
      <a:house>
        <a:sq_footage/>
        <a:address/>
      </a:house>
      <a:house>
        <a:sq_footage/>
        <a:address/>
      </a:house>
    </a:houses>
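
If the input also carries an xmlns:a declaration (as in the examples from the first reply), renaming the tags leaves that attribute behind. The sketch below is one possible variation on the approach above, not something from the reply itself: the helper name strip_prefixes is hypothetical, it uses twig_handlers instead of start_tag_handlers, and it assumes you want the declarations dropped as well. It relies on the fact that, without map_xmlns, XML::Twig treats xmlns declarations as ordinary attributes.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use XML::Twig;

# Hypothetical helper: return $xml with prefixes removed from element names
# and any xmlns / xmlns:* declarations dropped.
sub strip_prefixes {
    my ($xml) = @_;
    my $twig = XML::Twig->new(
        twig_handlers => {
            _all_ => sub {
                my ( $t, $elt ) = @_;

                # Same substitution as above: drop everything up to the colon.
                ( my $tag = $elt->tag ) =~ s{^[^:]*:}{};
                $elt->set_tag($tag);

                # Assumption: the declarations are no longer wanted either.
                $elt->del_att( grep { /^xmlns(?::|$)/ } $elt->att_names );
            },
        },
        keep_spaces => 1,    # to keep formatting
    );
    $twig->parse($xml);
    return $twig->sprint;
}

print strip_prefixes(<<'XML'), "\n";
<a:houses xmlns:a="urn:example:houses">
  <a:house>
    <a:sq_footage/>
    <a:address/>
  </a:house>
</a:houses>
XML
```

Note that this puts every element in no namespace at all, and that elements with the same local name under different prefixes would end up indistinguishable.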
Re: removing namespaces with XML::Twig?
by bitingduck (Chaplain) on Jun 12, 2012 at 03:45 UTC

    I tried the obvious thing: taking the example for "map_xmlns" and seeing if I could substitute an empty prefix. The result was that it leaves the prefix unchanged:

    #!/usr/bin/perl
    # very slightly modified example lifted from the docs
    use strict;
    use warnings;
    use XML::Twig;

    my $t= XML::Twig->new( map_xmlns => {'http://www.w3.org/2000/svg' => ""},
                           twig_handlers => { 'svg:circle' => sub { $_->set_att( r => 20) } },
                           pretty_print => 'indented',
                         )
                    ->parse( '<doc xmlns:gr="http://www.w3.org/2000/svg">
                                <gr:circle cx="10" cy="90" r="10"/>
                              </doc>'
                           )
                    ->print;

    The output is:

    <doc xmlns:gr="http://www.w3.org/2000/svg">
      <gr:circle cx="10" cy="90" r="10"/>
    </doc>

    If I put the "svg" from the example back in the hash, it works as in the example. If I put a space, I get a prefix that's a space and a colon. XML::LibXML::Element will let you turn off the namespace prefix element by element, but I'm too lazy right now to work through that one. Try poking around through the various XML::LibXML children.
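
For anyone who wants to try the XML::LibXML route, here is a rough sketch. It does not use the element-by-element prefix switching mentioned above; instead it takes a different, more conservative approach and rebuilds the tree from local names only. The helper names copy_without_ns and strip_namespaces are hypothetical, and the URI in the sample input is a placeholder.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use XML::LibXML;

# Hypothetical helper: copy $node into $doc, giving every element and
# attribute its local name only, so the copy is in no namespace.
sub copy_without_ns {
    my ( $node, $doc ) = @_;

    # Text, comments, etc. carry no namespace; a plain clone is enough.
    return $node->cloneNode(1) unless $node->nodeType == XML_ELEMENT_NODE;

    my $copy = $doc->createElement( $node->localname );
    for my $attr ( $node->attributes ) {
        next if $attr->isa('XML::LibXML::Namespace');    # skip xmlns declarations
        $copy->setAttribute( $attr->localname, $attr->value );
    }
    $copy->appendChild( copy_without_ns( $_, $doc ) ) for $node->childNodes;
    return $copy;
}

# Hypothetical helper: parse $xml and serialize a namespace-free copy of its root.
sub strip_namespaces {
    my ($xml) = @_;
    my $doc = XML::LibXML->load_xml( string => $xml );
    return copy_without_ns( $doc->documentElement, $doc )->toString;
}

print strip_namespaces(
    '<a:houses xmlns:a="urn:example:houses"><a:house id="1"><a:address/></a:house></a:houses>'
), "\n";
```

Unlike the XML::Twig version, XML::LibXML is namespace-aware, so the input must actually declare the prefix or it will not parse cleanly.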

Re: removing namespaces with XML::Twig?
by Anonymous Monk on Jun 12, 2012 at 01:56 UTC