<?xml version="1.0" encoding="windows-1252"?>
<node id="421956" title="Re: Using Perl XPath for converting Infopath XML files to Word Documents" created="2005-01-13 08:08:00" updated="2005-04-29 09:50:33">
<type id="11">
note</type>
<author id="9346">
mirod</author>
<data>
<field name="doctext">
&lt;p&gt;A few comments:&lt;/p&gt;
&lt;p&gt;You seem to think that in an XPath expression '//' denotes the top of the tree. It doesn't: a single leading '/' is the root, so the path you should be using is &lt;tt&gt;/Books/Book&lt;/tt&gt;. '//' is more like a wildcard (it is shorthand for the descendant-or-self axis): &lt;tt&gt;//Book&lt;/tt&gt; will find all the &lt;tt&gt;Book&lt;/tt&gt; elements in the document, at any depth. Using '//' in your case forces the XPath engine to test basically every node in the document, while &lt;tt&gt;/Books/Book&lt;/tt&gt; is much more efficient, as it only needs to test the root element and its children. For a good XPath tutorial have a look at &lt;a href="http://zvon.org/xxl/XPathTutorial/General/examples.html"&gt;zvon.org&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;A couple of minor stylistic quibbles: I don't think you need to write &lt;tt&gt;foreach my $book ($xp-&amp;gt;find('/Books/Book')-&amp;gt;get_nodelist)&lt;/tt&gt;, as &lt;tt&gt;findnodes&lt;/tt&gt; in list context returns a list of nodes, so you can just write &lt;tt&gt;foreach my $book ($xp-&amp;gt;findnodes('/Books/Book'))&lt;/tt&gt;. You could also replace &lt;tt&gt;$book-&amp;gt;find('author')-&amp;gt;string_value&lt;/tt&gt; with simply &lt;tt&gt;$book-&amp;gt;findvalue('author')&lt;/tt&gt;, which, besides being shorter, also has the benefit that it won't die if for some reason the &lt;tt&gt;author&lt;/tt&gt; element is not present.&lt;/p&gt;
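&lt;p&gt;To make this concrete, here is a minimal sketch using &lt;tt&gt;findnodes&lt;/tt&gt; and &lt;tt&gt;findvalue&lt;/tt&gt;. The sample XML is made up (I am only assuming a &lt;tt&gt;Books&lt;/tt&gt; root with &lt;tt&gt;Book&lt;/tt&gt; children that have &lt;tt&gt;title&lt;/tt&gt; and &lt;tt&gt;author&lt;/tt&gt; elements), so adapt the names to your real InfoPath data:&lt;/p&gt;
&lt;code&gt;
use strict;
use warnings;
use XML::XPath;

# Hypothetical sample document, only for illustration.
my $xml = &lt;&lt;'XML';
&lt;Books&gt;
  &lt;Book&gt;&lt;title&gt;First&lt;/title&gt;&lt;author&gt;A. Writer&lt;/author&gt;&lt;/Book&gt;
  &lt;Book&gt;&lt;title&gt;Second&lt;/title&gt;&lt;/Book&gt;
&lt;/Books&gt;
XML

my $xp = XML::XPath-&gt;new(xml =&gt; $xml);

# Absolute path: only the root element and its children are tested.
my @books = $xp-&gt;findnodes('/Books/Book');

# '//' selects the same nodes here, but has to look at every node in the document.
my @anywhere = $xp-&gt;findnodes('//Book');

foreach my $book (@books) {
    # Force plain strings; findvalue gives an empty value when nothing matches.
    my $title  = '' . $book-&gt;findvalue('title');
    my $author = '' . $book-&gt;findvalue('author');
    $author = 'unknown' if $author eq '';
    print "$title by $author\n";
}
&lt;/code&gt;
&lt;p&gt;The second &lt;tt&gt;Book&lt;/tt&gt; has no &lt;tt&gt;author&lt;/tt&gt; on purpose, to show that &lt;tt&gt;findvalue&lt;/tt&gt; copes with a missing element.&lt;/p&gt;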
&lt;p&gt;Finally, you wrote: &lt;i&gt;the XPath Perl module which is part of the XML module&lt;/i&gt;. Not quite: there is no single "XML module". [cpan://XML::XPath] is a module in the XML:: namespace, just like [XML::Parser], [XML::Simple] or any other XML:: module.&lt;/p&gt;
</field>
<field name="root_node">
421936</field>
<field name="parent_node">
421936</field>
</data>
</node>
