PerlMonks  

Re: LibXML, XPath and Namespaces

by tobyink (Abbot)
on Mar 21, 2013 at 21:32 UTC (#1024823)


in reply to LibXML, XPath and Namespaces

Can I have my bonus points please??

use v5.10;
use strict;
use warnings;
use XML::LibXML;

my $xml = XML::LibXML->load_xml(IO => \*DATA);

say "The root element's namespace is: ",
    $xml->documentElement->namespaceURI;

# Give that namespace a prefix so that we can reference it in XPath
$xml->documentElement->setNamespaceDeclPrefix("", "gt");

say "Look! The new prefix works! Found: ",
    $xml->findvalue('//gt:EnvelopeVersion');

__DATA__
<?xml version="1.0"?>
<GovTalkMessage xmlns="http://www.govtalk.gov.uk/CM/envelope">
  <EnvelopeVersion>2.0</EnvelopeVersion>
  <Header>
    <MessageDetails>
      .....
    </MessageDetails>
  </Header>
  <GovTalkDetails>
    .....
  </GovTalkDetails>
  <Body>
    <!-- A valid Body payload with a namespace declaration on the first element -->
  </Body>
</GovTalkMessage>

package Cow { use Moo; has name => (is => 'lazy', default => sub { 'Mooington' }) } say Cow->new->name


No bonus points for you :-)
by space_monk (Chaplain) on Mar 22, 2013 at 09:28 UTC

    What I really wanted to achieve was for the system to assume the default namespace was 'gt' so I didn't have to include it in the prefix in all XPath expressions.

    It's fine when it's just one level deep, e.g. EnvelopeVersion, but when you want to pick up a number of nodes 3 or 4 levels deep and keep having to repeat that 'gt:' at every level, it's a PITA.

    I did mod your reply up for the effort though :-)

    A Monk aims to give answers to those who have none, and to learn from those who know more.

      "What I really wanted to achieve was for the system to assume the default namespace was 'gt' so I didn't have to include it in the prefix in all XPath expressions."

      Well, that would break XPath spec compliance. As per the XPath spec, node names with no colon always reference nodes with no namespace at all.

      Otherwise, if you could somehow set "gt" to be the default namespace for XPaths, you wouldn't be able to distinguish between the following two attributes:

      <gt:foo gt:bar="1" bar="2" />
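To make the distinction concrete, here is a minimal sketch of how the two attributes are told apart. The namespace URI `http://example.com/gt` is made up for illustration (the snippet above declares none), and the prefix is bound through `XML::LibXML::XPathContext` simply to keep the binding explicit.

```perl
use v5.10;
use strict;
use warnings;
use XML::LibXML;

# Hypothetical sample: the URI 'http://example.com/gt' is an assumption,
# invented only so that the gt prefix is declared somewhere.
my $doc = XML::LibXML->load_xml(string => <<'XML');
<gt:foo xmlns:gt="http://example.com/gt" gt:bar="1" bar="2"/>
XML

my $xpc = XML::LibXML::XPathContext->new($doc);
$xpc->registerNs(gt => 'http://example.com/gt');

# Prefixed name: the attribute that lives in the gt namespace.
my $namespaced = $xpc->findvalue('/gt:foo/@gt:bar');

# Unprefixed name: the attribute that lives in no namespace.
my $plain = $xpc->findvalue('/gt:foo/@bar');

# An unprefixed element name only matches elements in no namespace,
# so this finds nothing even though the root's local name is "foo".
my $unprefixed_hits = $xpc->findnodes('/foo')->size;

say "gt:bar = $namespaced";
say "bar = $plain";
say "matches for /foo: $unprefixed_hits";
```

If unprefixed names were silently mapped to the gt namespace, `@bar` and `@gt:bar` would become the same test and the second attribute would be unreachable.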

      "It's fine when it's just one level deep, e.g. EnvelopeVersion, but when you want to pick up a number of nodes 3 or 4 levels deep and keep having to repeat that 'gt:' at every level, it's a PITA."

      I enjoy golf as much as the next man, but is three characters per name really so bad? (You could always bind the namespace to just "g" so it was two characters.) I saved you having to construct XML::LibXML::XPathContext objects, didn't I??
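For comparison, a sketch of the `XML::LibXML::XPathContext` route mentioned above, with the namespace bound to the one-letter prefix "g". The `Class` element and its value are placeholders standing in for the elided content of `MessageDetails`; they are not taken from the original document.

```perl
use v5.10;
use strict;
use warnings;
use XML::LibXML;

# Trimmed copy of the sample message. <Class> and its text are
# hypothetical placeholders for the elided MessageDetails content.
my $doc = XML::LibXML->load_xml(string => <<'XML');
<GovTalkMessage xmlns="http://www.govtalk.gov.uk/CM/envelope">
  <EnvelopeVersion>2.0</EnvelopeVersion>
  <Header>
    <MessageDetails>
      <Class>example-class</Class>
    </MessageDetails>
  </Header>
</GovTalkMessage>
XML

# Bind the namespace URI to a short prefix for use in XPath only;
# the document itself is left untouched.
my $xpc = XML::LibXML::XPathContext->new($doc);
$xpc->registerNs(g => 'http://www.govtalk.gov.uk/CM/envelope');

my $version = $xpc->findvalue('/g:GovTalkMessage/g:EnvelopeVersion');

# A deep path repeats the prefix at every step...
my $class = $xpc->findvalue(
    '/g:GovTalkMessage/g:Header/g:MessageDetails/g:Class');

# ...but evaluating relative to a context node keeps each path short.
my ($details) = $xpc->findnodes('//g:MessageDetails');
my $class_relative = $xpc->findvalue('g:Class', $details);

say "EnvelopeVersion: $version";
say "Class (absolute path): $class";
say "Class (relative path): $class_relative";
```

Grabbing an intermediate node once and querying relative to it does not remove the prefix, but it does cut down how often it has to be repeated in deep paths.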

      If your XPaths are fairly simple, you could take a look at XML::LibXML::QuerySelector which allows you to select nodes using CSS selectors. I wrote it for use with (X)HTML, but I don't see any reason it shouldn't roughly work with arbitrary XML.


        Everything you said is true, but adding the prefix makes the XPath selectors harder to read. Namespace confusion doesn't really arise here, as the entire document is written in the specified namespace.

        I suppose another dirty way is to strip the xmlns attribute from the document before parsing it... :-)
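A rough sketch of that "dirty" approach, for illustration only. It assumes the declaration is written with double quotes and that nothing else in the document relies on a default namespace; with the `/g` flag it would also strip the declaration on the Body payload's first element, so the payload would lose its namespace too.

```perl
use v5.10;
use strict;
use warnings;
use XML::LibXML;

my $raw = <<'XML';
<?xml version="1.0"?>
<GovTalkMessage xmlns="http://www.govtalk.gov.uk/CM/envelope">
  <EnvelopeVersion>2.0</EnvelopeVersion>
</GovTalkMessage>
XML

# Remove every default-namespace declaration before parsing.
# Assumption: declarations use double quotes; prefixed declarations
# such as xmlns:foo="..." are deliberately left alone.
(my $stripped = $raw) =~ s/\s+xmlns="[^"]*"//g;

my $doc = XML::LibXML->load_xml(string => $stripped);

# With no namespace left, unprefixed XPath names match again.
my $version = $doc->findvalue('/GovTalkMessage/EnvelopeVersion');

say "EnvelopeVersion: $version";
```

The parsed tree no longer matches the real document's namespaces, so anything serialised back out of it would not be a valid GovTalk message.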

        Anyway my basic question has been answered in the negative, so I'm going to simply have to live with the "clean" way of doing it....


      That wouldn't be a valid XPath. XPath "foo" matches child elements named "foo" in the null namespace. There's no way to specify a default namespace for nodetests in an XPath.

      Furthermore, gt is a prefix, not a namespace. http://www.govtalk.gov.uk/CM/envelope is the namespace in this case. gt is completely arbitrary, meaningless.
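A small sketch of the point that the prefix is arbitrary: two unrelated prefixes (both chosen here purely for illustration) are bound to the same namespace URI, and either one selects the same element.

```perl
use v5.10;
use strict;
use warnings;
use XML::LibXML;

my $doc = XML::LibXML->load_xml(string => <<'XML');
<GovTalkMessage xmlns="http://www.govtalk.gov.uk/CM/envelope">
  <EnvelopeVersion>2.0</EnvelopeVersion>
</GovTalkMessage>
XML

my $uri = 'http://www.govtalk.gov.uk/CM/envelope';
my $xpc = XML::LibXML::XPathContext->new($doc);

# Both prefixes are illustrative; only the URI they map to matters.
$xpc->registerNs(gt       => $uri);
$xpc->registerNs(anything => $uri);

my $via_gt    = $xpc->findvalue('//gt:EnvelopeVersion');
my $via_other = $xpc->findvalue('//anything:EnvelopeVersion');

say "gt prefix: $via_gt";
say "anything prefix: $via_other";
```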

        Yes, I'm aware the prefix is meaningless. Word it how you like, I wanted some way of making a null/default prefix map to my namespace. I know that XML has its flaws and this seems to be one of them.

