
Re^2: XML::Twig Text replacement

by mirod (Canon)
on May 01, 2010 at 21:13 UTC (#837954)

in reply to Re: XML::Twig Text replacement
in thread XML::Twig Text replacement

Actually you can handle namespaces in XML::Twig, using the map_xmlns option. I am not sure it's worth doing in this case, though. (It might be a good example of why I dislike seemingly gratuitous default namespaces: they just make processing harder while providing exactly 0 added value.)

Also, if you pass the id => 'Id' option to new, you can then write $Id = $_->id and $_->set_id($CsID), which I think is slightly clearer. It has the added benefit, if need be, of letting you access an element directly through its id, using the elt_id method.
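A minimal sketch of the id option described above. The sample document, the item element name, and the old/new renaming rule are made up for illustration; only id, set_id and elt_id come from the post. The rename_ids sub is a hypothetical helper, not something from the original thread.

```perl
use strict;
use warnings;
use XML::Twig;

# Hypothetical sample document; only the attribute name 'Id' comes from the post.
my $xml = q{<root><item Id="old-1">first</item><item Id="old-2">second</item></root>};

# Parse $xml, rewriting every Id from "old-..." to "new-..." (an invented rule).
# Returns the twig and the list of ids seen before renaming.
sub rename_ids {
    my ($xml) = @_;
    my @seen;
    my $t = XML::Twig->new(
        id            => 'Id',    # treat the 'Id' attribute as the element id
        twig_handlers => {
            item => sub {
                my $Id = $_->id;              # rather than $_->att('Id')
                push @seen, $Id;
                (my $CsID = $Id) =~ s/^old/new/;
                $_->set_id($CsID);            # rather than $_->set_att(Id => $CsID)
            },
        },
    );
    $t->parse($xml);
    return ($t, \@seen);
}

my ($t, $seen) = rename_ids($xml);
print "ids seen while parsing: @$seen\n";

# elt_id is called on the twig and looks an element up by its id.
my $elt = $t->elt_id('new-2');
print "found by id: ", $elt->text, "\n" if $elt;

print $t->sprint, "\n";
```

The lookup is guarded with "if $elt" because elt_id returns nothing when no element carries that id.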

Re^3: XML::Twig Text replacement
by ikegami (Pope) on May 02, 2010 at 01:18 UTC

    Ah good. I don't use XML::Twig, so I don't have a deep knowledge of it.

    If it consistently ignored namespaces when map_xmlns isn't used, it would be a great shortcut despite being non-standard, since there is rarely a need to deal with namespace conflicts. (The module never claimed its paths to be real XPaths.) Unfortunately, it doesn't consistently ignore namespaces.

    On the plus side, it works according to standard when map_xmlns is used. (Well, I'm not sure how namespaces interact with attributes, so I'm simply commenting on elements.)

    use strict;
    use warnings;

    use XML::Twig qw( );

    my $xml = <<'__EOI__';
    <?xml version="1.0" encoding="UTF-8" standalone="no" ?>
    <root xmlns:foo="uri:foo">
        <ele id="a" />
        <ele id="b" xmlns="uri:foo"/>
        <foo:ele id="c" />
    </root>
    __EOI__

    {
        my $seen = '';

        my $t = XML::Twig->new(
            twig_handlers => {
                'ele' => sub { $seen .= $_->att('id') },
            },
        );
        $t->parsestring($xml);

        print("$seen\n");
        print($seen eq 'a'                   ? "Standard\n"   : "Not standard\n");
        print($seen eq 'a' || $seen eq 'abc' ? "Consistent\n" : "Not consistent\n");
    }

    print("\n");

    {
        my $seen_null = '';
        my $seen_foo  = '';

        my $t = XML::Twig->new(
            map_xmlns => {
                'uri:foo' => 'f',
            },
            twig_handlers => {
                'ele'   => sub { $seen_null .= $_->att('id') || $_->att('f:id') },
                'f:ele' => sub { $seen_foo  .= $_->att('id') || $_->att('f:id') },
            },
        );
        $t->parsestring($xml);

        print("$seen_null:$seen_foo\n");
        print($seen_null eq 'a'                        ? "Standard\n"   : "Not standard\n");
        print($seen_null eq 'a' || $seen_null eq 'abc' ? "Consistent\n" : "Not consistent\n");
        print($seen_foo  eq 'bc'                       ? "NS working\n" : "NS broken\n");
    }

    ab
    Not standard
    Not consistent

    a:bc
    Standard
    Consistent
    NS working

      There are actually 3 ways of dealing with namespaces, 2 of which are supported by XML::Twig:

      • proper support, your second example, which is correct but verbose, as you have to associate namespace URIs with prefixes; I also suspect that it is not as robust as one would expect, since I don't quite trust those URIs not to change sneakily, especially for the default namespace;
      • ignoring namespace declarations, which is XML::Twig's default mode: there foo:ele is the element by that name, and ele is just ele, whether it is assigned a default namespace or not; you think of this as inconsistent;
      • dropping all namespaces, which is your first example: there foo:ele is seen as ele; I had never thought of that option, and I could add it, but I strongly suspect that it would be favored only by users who care about namespaces and already use map_xmlns (or XML::LibXML!).

      For XML::Twig, consistency is not as important as convenience, and the current behaviour seems to be convenient for most users. I'll look into adding an option to just drop namespaces. It should not be too difficult, but I am not sure how useful it would be.

        "There are actually 3 ways of dealing with namespaces"

        "Drop" and "ignore" mean the same thing. At the very least, the difference is ambiguous. Let's clarify:

        • Standard support (verbose)
        • Ignore namespace declarations (assume everything is in the null namespace)
        • Ignore namespace declarations and make prefixes significant.

        The last is XML::Twig's default. It also supports the first.

        For XML::Twig, consistency is not as important as convenience

        That's exactly the point I was making.

        Prefixes are arbitrary strings that can and do change from document to document. The employment of a default namespace also can and does change from document to document.

        It would be very convenient if it could ignore namespaces, but it doesn't have a way of doing that. Instead of just ignoring namespaces, the default is to make the prefix significant. Not only is that a step in the wrong direction, it makes it unusable in theory and in practice.

        Being unable to use XML::Twig under common circumstances is not convenient at all.
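To make the point about arbitrary prefixes concrete, here is a small sketch. The three documents, the prefixes foo and bar, and the ids_seen helper are all invented for illustration; the only XML::Twig features used are twig_handlers and map_xmlns, as in the test program earlier in the thread. A namespace-aware parser would treat the three documents as equivalent, since every ele is in the uri:foo namespace in each of them.

```perl
use strict;
use warnings;
use XML::Twig;

# Three hypothetical documents that differ only in how the namespace is spelled.
my %docs = (
    'prefix foo' => q{<root xmlns:foo="uri:foo"><foo:ele id="a"/><foo:ele id="b"/></root>},
    'prefix bar' => q{<root xmlns:bar="uri:foo"><bar:ele id="a"/><bar:ele id="b"/></root>},
    'default ns' => q{<root xmlns="uri:foo"><ele id="a"/><ele id="b"/></root>},
);

# Hypothetical helper: parse $xml with one handler and return the ids it saw.
sub ids_seen {
    my ($xml, $handler_name, %twig_opts) = @_;
    my @ids;
    my $t = XML::Twig->new(
        %twig_opts,
        twig_handlers => {
            $handler_name => sub { push @ids, $_->att('id') },
        },
    );
    $t->parse($xml);
    return join ',', @ids;
}

for my $name (sort keys %docs) {
    # Default mode: the handler is written against the literal prefix 'foo'.
    my $default = ids_seen($docs{$name}, 'foo:ele');

    # map_xmlns: the handler is written against the namespace URI, via prefix 'f'.
    my $mapped = ids_seen($docs{$name}, 'f:ele', map_xmlns => { 'uri:foo' => 'f' });

    print "$name: default mode saw '$default', map_xmlns saw '$mapped'\n";
}
```

A handler keyed on the URI through map_xmlns does not depend on which prefix a given document happens to use, whereas a handler keyed on a literal prefix has to be rewritten whenever the prefix changes.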
