http://www.perlmonks.org?node_id=11107200

Darkwing has asked for the wisdom of the Perl Monks concerning the following question:

Dear monks,

I'm maintaining a Perl application (a command-line tool using XML::LibXML) where users can write a config file containing rules. If present, these rules are applied to the input (an XML file) and matches are reported.

Simplified example (foo.xml):

<objects>
  <obj>
    <id>1</id>
    <version>2</version>
    <refs>
      <id>2</id>
      <id>5</id>
    </refs>
  </obj>
  <obj>
    <id>2</id>
    <version>2</version>
  </obj>
  <obj>
    <id>3</id>
    <version>2</version>
    <refs>
      <id>2</id>
      <id>4</id>
    </refs>
  </obj>
  <obj>
    <id>4</id>
    <version>2</version>
  </obj>
  <obj>
    <id>5</id>
    <version></version>
  </obj>
</objects>

Each object in this simplified example has an id and a version, and may have a refs element that references other objects by one or more ids.

Currently, these rules are basically XPath expressions that check nodes of a single object, and I use findnodes() from XML::LibXML to evaluate them. But now it is required to also check nodes of referenced objects. For example: "The versions of the referenced objects should all be the same as the version of the current object". It seems to me that this cannot be done with plain XPath and XML::LibXML's findnodes() alone. Is that right?
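To make the example rule concrete, here is a minimal sketch in plain Perl of what it should report for the sample data above. The %objects hash and the matching_ids() helper are hypothetical names used only for illustration, and the sketch assumes that objects without any refs are not reported.

```perl
use strict;
use warnings;

# Hypothetical in-memory model of foo.xml: id => { version, refs }.
my %objects = (
    1 => { version => '2', refs => [2, 5] },
    2 => { version => '2', refs => [] },
    3 => { version => '2', refs => [2, 4] },
    4 => { version => '2', refs => [] },
    5 => { version => '',  refs => [] },
);

# Returns the ids of objects that have at least one reference and whose
# referenced objects all carry the same version as the object itself.
sub matching_ids {
    my ($objs) = @_;
    my @hits;
    for my $id (sort { $a <=> $b } keys %{$objs}) {
        my $obj  = $objs->{$id};
        my @refs = @{ $obj->{refs} };
        next unless @refs;    # assumption: objects without refs are skipped
        # Count references that are missing or have a different version.
        my $mismatches = grep {
            !exists $objs->{$_} || $objs->{$_}{version} ne $obj->{version}
        } @refs;
        push @hits, $id unless $mismatches;
    }
    return @hits;
}

print join(',', matching_ids(\%objects)), "\n";    # prints "3"
```

Object 1 is rejected because it references object 5, whose version is empty; object 3 is the only one whose references (2 and 4) all share its version.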

I finally found the following solution, which uses custom XPath functions registered through XML::LibXML::XPathContext:

use strict;
use warnings;
use XML::LibXML;
use XML::LibXML::XPathContext;

# Custom XPath function: returns $idNodeList if every referenced object
# satisfies $expression (with :1: replaced by the quoted $value),
# and returns nothing otherwise.
sub getObjAll {
    my ($topList, $actNode, $idNodeList, $expression, $value) = @_;
    my $top = $topList->[0];
    $expression =~ s/:1:/'$value'/g;
    my $failed;
    foreach my $idNode (@{$idNodeList}) {
        my $id = $idNode->textContent;
        # Look up the referenced object and test the expression against it.
        my $nodes = $top->findnodes("/objects/obj[id='$id' and $expression]");
        return unless @{$nodes};
    }
    return $idNodeList;
};

my $dom = XML::LibXML->load_xml(location => 'foo.xml');
my $xc  = XML::LibXML::XPathContext->new($dom);
$xc->registerFunction('getObjAll', \&getObjAll);

# Arguments: root node, current node, referenced ids, rule template, value.
my @nodes = $xc->findnodes("/objects/obj[getObjAll(/, ., ./refs/id, " .
                           "'./version=:1:', ./version)]");
foreach my $node (@nodes) {
    print $node->toString(1);
    print "\n";
}

(In my application, the XPath expression passed to $xc->findnodes(...) would be taken from the user's config file.)

It works, but I find it ugly that one must pass both the current node and the root node. Is there any way to get around this? Are there other possible improvements?

PS: It would not be practical to change to another XML module, since my application consists of many, many classes and XML::LibXML is heavily used.
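One direction that stays within XML::LibXML, offered only as a rough sketch: since every node knows its document via ownerDocument, the callback may not need the root node as an argument, and the current node can be dropped as well because only its version is used. The names refs_all_match, refsAllMatch and matching_objects are hypothetical, and the sketch assumes the value contains no single quote. It passes string(version) so the callback receives a plain scalar, and returns an empty XML::LibXML::NodeList when a reference fails.

```perl
use strict;
use warnings;
use XML::LibXML;
use XML::LibXML::XPathContext;

# Hypothetical variant of getObjAll(): takes only the id nodes, the
# expression template and the value; the document comes from the nodes.
sub refs_all_match {
    my ($idNodeList, $expression, $value) = @_;
    my $quoted = "'$value'";    # assumption: $value contains no single quote
    $expression =~ s/:1:/$quoted/g;
    foreach my $idNode (@{$idNodeList}) {
        my $doc = $idNode->ownerDocument;    # replaces the explicit "/" argument
        my $id  = $idNode->textContent;
        my @hit = $doc->findnodes("/objects/obj[id='$id' and $expression]");
        # An empty node set evaluates to false inside the XPath predicate.
        return XML::LibXML::NodeList->new unless @hit;
    }
    return $idNodeList;
}

# Registers the function and returns the objects that satisfy the rule.
sub matching_objects {
    my ($dom) = @_;
    my $xc = XML::LibXML::XPathContext->new($dom);
    $xc->registerFunction('refsAllMatch', \&refs_all_match);
    return $xc->findnodes(
        "/objects/obj[refsAllMatch(refs/id, 'version=:1:', string(version))]");
}

# Same data as foo.xml, inlined so the sketch is self-contained.
my $dom = XML::LibXML->load_xml(string => <<'XML');
<objects>
  <obj><id>1</id><version>2</version><refs><id>2</id><id>5</id></refs></obj>
  <obj><id>2</id><version>2</version></obj>
  <obj><id>3</id><version>2</version><refs><id>2</id><id>4</id></refs></obj>
  <obj><id>4</id><version>2</version></obj>
  <obj><id>5</id><version></version></obj>
</objects>
XML

print $_->findvalue('id'), "\n" for matching_objects($dom);
```

The rule string in the config file then shrinks to the id node set, the template and the value. Whether this is cleaner than the original depends on how much of the call signature is meant to be visible to the users who write the rules.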