Re: Namespaced XML::LibXML XPath query

Replies are listed 'Best First'.
Re^2: Namespaced XML::LibXML XPath query (not a bug) by Aristotle (Chancellor) on Feb 20, 2006 at 00:53 UTC
The behaviour you saw is absolutely correct and not a bug at all. To quote the author of libxml2 from a message aptly titled "Re: [xml] XPath and default namespaces (bet you're sick of this by now :) )":

"You cannot define a default namespace for XPath, period, don't try you can't, the XPath spec does not allow it. This can't work and trying to add it to libxml2 would simply make it non conformant to the spec. In a nutshell forget about using default namespace within XPath expressions, this will never work, you can't !"

Google [daniel veillard default namespace xpath] if you want more.

As he says, XPath has no notion of a default namespace. `//lastName` in an XPath expression always matches that element in the null namespace, not the default namespace. According to the spec:

"A QName in the node test is expanded into an expanded-name using the namespace declarations from the expression context. This is the same way expansion is done for element type names in start and end-tags except that the default namespace declared with `xmlns` is not used: if the QName does not have a prefix, then the namespace URI is null (this is the same way attribute names are expanded)."

In `//sdnList:lastName`, `sdnList` is not a namespace. Only URIs can be namespaces. The stuff in front of the colon is the prefix, and is merely a stand-in for the URI.

`<sdnList xmlns="http://tempuri.org/sdnList.xsd">` puts the `sdnList` element (and all its prefix-less descendants) in the `http://tempuri.org/sdnList.xsd` namespace. You have to associate this URI with a prefix, then use the prefix in your expression. This is exactly the approach lestrrat posted:

    my $xc = XML::LibXML::XPathContext->new( $doc->documentElement() );
    $xc->registerNs( foobar => 'http://tempuri.org/sdnList.xsd' );
    my $result = $xc->findvalue( '//foobar:lastName' );

I wrote about this a while ago.

Note that the prefix is arbitrary and has nothing to do with what appears in your document. This is as it should be, because the following document means exactly the same as the one you have:

    <camel:sdnList xmlns:camel="http://tempuri.org/sdnList.xsd">
      <camel:sdnEntry>
        <camel:lastName>Hello world!</camel:lastName>
      </camel:sdnEntry>
    </camel:sdnList>

For that matter, even this means the same:

    <camel:sdnList xmlns:camel="http://tempuri.org/sdnList.xsd">
      <penguin:sdnEntry xmlns:penguin="http://tempuri.org/sdnList.xsd">
        <camel:lastName>Hello world!</camel:lastName>
      </penguin:sdnEntry>
    </camel:sdnList>

Or this:

    <sdnList xmlns="http://tempuri.org/sdnList.xsd">
      <penguin:sdnEntry xmlns:penguin="http://tempuri.org/sdnList.xsd">
        <lastName>Hello world!</lastName>
      </penguin:sdnEntry>
    </sdnList>

You get the idea.

Makeshifts last the longest.
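The point about prefixes being arbitrary can be checked with a short script. This is only a sketch: the `default_ns` document is an assumed reconstruction of the original poster's file (only its root start-tag is quoted in this thread), the hash keys and the `%results` variable are made up for illustration, and it assumes an XML::LibXML recent enough to provide `load_xml`. It runs the same prefixed query, plus an unprefixed one, against three spellings of the document.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use XML::LibXML;

my $ns = 'http://tempuri.org/sdnList.xsd';

# Three spellings of the same document; only the prefixes differ.
# The element structure of 'default_ns' is assumed, not taken from the thread.
my %docs = (
    default_ns => qq{<sdnList xmlns="$ns"><sdnEntry><lastName>Hello world!</lastName></sdnEntry></sdnList>},
    camel      => qq{<camel:sdnList xmlns:camel="$ns"><camel:sdnEntry><camel:lastName>Hello world!</camel:lastName></camel:sdnEntry></camel:sdnList>},
    mixed      => qq{<sdnList xmlns="$ns"><penguin:sdnEntry xmlns:penguin="$ns"><lastName>Hello world!</lastName></penguin:sdnEntry></sdnList>},
);

my %results;
for my $name ( sort keys %docs ) {
    my $doc = XML::LibXML->load_xml( string => $docs{$name} );
    my $xc  = XML::LibXML::XPathContext->new( $doc->documentElement );

    # 'foobar' appears in none of the documents; it only stands in for the URI.
    $xc->registerNs( foobar => $ns );

    $results{$name} = {
        prefixed   => $xc->findvalue('//foobar:lastName'),
        unprefixed => $xc->findvalue('//lastName'),    # null namespace only
    };
    printf "%-10s prefixed='%s' unprefixed='%s'\n",
        $name, @{ $results{$name} }{qw(prefixed unprefixed)};
}
```

If the explanation above holds, the prefixed query should find the text in every variant, and the unprefixed one should come back empty each time, since none of the documents has a `lastName` element in the null namespace.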
Re^3: Namespaced XML::LibXML XPath query (not a bug) by jbfamilly (Initiate) on Oct 23, 2008 at 10:42 UTC
Hi Monks, I must be missing something simple. Could you please help me grasp this concept... Take the following example XML:

    <aaa xmlns="xmlapi_1.0">
      <bbb>
        <ccc>
          <d1>blah</d1>
          <d2>blah</d2>
          <d3>blah</d3>
        </ccc>
        <ccc>
          <d1>blah</d1>
          <d2>blah</d2>
          <d3>blah</d3>
        </ccc>
      </bbb>
    </aaa>

I need to iterate through each `<ccc>`. I worked out how to get the list of `<ccc>` nodes, and this thread confirms that what I did is correct. But now that I have the `<ccc>` node, how do I get the `<dx>` properties? I've tried with and without the namespace already defined, but still no love. It gets worse: the XML I receive could have `<e>` nested in `<d>`.
Re^4: Namespaced XML::LibXML XPath query (not a bug) by grantm (Parson) on Oct 24, 2008 at 00:52 UTC
OK, here's a standalone example that might help:

    #!/usr/bin/perl

    use strict;
    use warnings;

    use XML::LibXML;
    use XML::LibXML::XPathContext;

    my $parser = XML::LibXML->new();
    my $doc    = $parser->parse_fh(\*DATA);

    my $xc = XML::LibXML::XPathContext->new( $doc->documentElement() );
    $xc->registerNs( xapi => 'xmlapi_1.0' );

    foreach my $ccc ($xc->findnodes('//xapi:ccc')) {
        print "Found a ccc\n";
        foreach my $d2 ( $xc->findnodes('./xapi:d2', $ccc) ) {
            print "  d2 element contained: '" . $d2->to_literal . "'\n";
        }
        if(my $animal = $xc->findvalue('./xapi:zoo/xapi:critter', $ccc) ) {
            print "  The mystery animal is '$animal'\n";
        }
    }

    exit;

    __DATA__
    <aaa xmlns="xmlapi_1.0">
      <bbb>
        <ccc>
          <d1>blah d1a</d1>
          <d2>blah d2a</d2>
          <d3>blah d3a</d3>
          <zoo>
            <critter>Monkey</critter>
          </zoo>
        </ccc>
        <ccc>
          <d1>blah d1b</d1>
          <d2>blah d2b</d2>
          <d3>blah d3b</d3>
          <zoo>
            <critter>Giraffe</critter>
          </zoo>
        </ccc>
      </bbb>
    </aaa>

So the key point is that if you want to match an element that has a namespace (explicit via a prefix or inherited from a parent element) then you must include the namespace when you refer to the element in your XPath expression.

When matching a namespace, the only thing that matters is the URI. The prefix used in the source document (if there was one) is irrelevant. The prefix used in your code when you register the namespace URI is also irrelevant. What matters is that your XPath query includes a prefix that has been registered to associate it with the same URI as the namespace declared in the source document.
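For the other half of the question (visiting every `<dx>` child without naming each one, and picking up an `<e>` nested inside a `<d>`), the same registered prefix can be combined with a wildcard name test and the descendant axis. The following is a sketch under assumptions: the sample XML with a nested `<e>` is invented for illustration, `@seen` is just a hypothetical collector, and `load_xml` assumes a reasonably recent XML::LibXML.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use XML::LibXML;

# Invented sample: the poster's document with an <e> nested inside <d1>.
my $xml = <<'XML';
<aaa xmlns="xmlapi_1.0">
  <bbb>
    <ccc>
      <d1><e>nested e1</e></d1>
      <d2>blah d2a</d2>
    </ccc>
  </bbb>
</aaa>
XML

my $doc = XML::LibXML->load_xml( string => $xml );
my $xc  = XML::LibXML::XPathContext->new($doc);
$xc->registerNs( xapi => 'xmlapi_1.0' );

my @seen;
for my $ccc ( $xc->findnodes('//xapi:ccc') ) {
    # 'xapi:*' matches any child element in the registered namespace.
    for my $child ( $xc->findnodes( './xapi:*', $ccc ) ) {
        push @seen, $child->localname;
        print "child: ", $child->localname, "\n";

        # './/' searches all descendants of the current context node.
        for my $e ( $xc->findnodes( './/xapi:e', $child ) ) {
            push @seen, 'e=' . $e->textContent;
            print "  nested e: ", $e->textContent, "\n";
        }
    }
}
```

The second argument to `findnodes` is what makes the inner queries relative: each expression starting with `.` is evaluated against the node passed in, not against the document root.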
Re^5: Namespaced XML::LibXML XPath query (not a bug) by jbfamilly (Initiate) on Oct 24, 2008 at 03:42 UTC
Re^2: Namespaced XML::LibXML XPath query by acid06 (Friar) on Feb 16, 2006 at 14:33 UTC
I've just checked the specifications of how it should be and it is, indeed, a bug, although I don't know whether it's a libxml2 bug or a bug in the Perl bindings to it (i.e. XML::LibXML). Either way, you should report it to the authors. But I don't know if it's still maintained, since the last update happened in 2004.

acid06
perl -e "print pack('h*', 16369646), scalar reverse $="
Re^3: Namespaced XML::LibXML XPath query by diotalevi (Canon) on Feb 16, 2006 at 15:00 UTC
I reported this to rt.cpan.org as soon as I found that it was a bug.

⠤⠤ ⠙⠊⠕⠞⠁⠇⠑⠧⠊


	PerlMonks