Re^2: XML parsing and Lists

by madbee (Acolyte)
on Jul 05, 2013 at 00:23 UTC (#1042553)


in reply to Re: XML parsing and Lists
in thread XML parsing and Lists

Thanks for responding. I was not aware of the XPathContext module in XML::LibXML. I was trying to use XML::XPath directly and ran into an endless series of installation issues that I could not get past.

I will try using this approach. Basically, I have to create an array of nodes for the path:

$parser = XML::LibXML->new;
$dom = $parser->parse_file($file);
$root = $dom->getDocumentElement;
$dom->setDocumentElement($root);
my $xc = XML::LibXML::XPathContext->new($file);
my @nodes = $xc->findnodes('//Article//Part//Sect//H5[contains(.,"Include")]', $dom);
if (@nodes) {
    $count = $xc->findvalue('count(//Article//Part//Sect//LI)', $dom);
    print $count;
}

Am I on the right track? Anything I'm missing?

Thanks much.


Re^3: XML parsing and Lists
by choroba (Abbot) on Jul 05, 2013 at 00:50 UTC
    You are overcomplicating the problem. Do not use setDocumentElement; it creates a new root element. The constructor of XPathContext takes a context node as a parameter, not a file. This is a Short, Self Contained, Correct Example:
    #!/usr/bin/perl
    use warnings;
    use strict;

    use XML::LibXML;

    my $dom = XML::LibXML->load_xml(string => << '__XML__');
    <Article> <!-- fixed typo -->
      <Main>
        <Sect>
          <H4>Include</H4>
          .....
          <P1> This is the criteria</P1>
          <L>
            <LI>
              <LI_Label>1.</LI_Label>
              <LI_Title>Critera 1</LI_Title>
            </LI>
            <LI>
              <LI_Label>2.</LI_Label>
              <LI_Title>Critera 2</LI_Title>
            </LI>
            <LI>
              <LI_Label>3.</LI_Label>
              <LI_Title>Critera 3</LI_Title>
            </LI>
            <LI>
              <LI_Label>4.</LI_Label>
              <LI_Title>Critera 3</LI_Title>
            </LI>
          </L> <!-- fixed missing closing tag -->
        </Sect>
      </Main>
    </Article>
    __XML__

    my $xc = XML::LibXML::XPathContext->new;
    my $count = $xc->findvalue('count(//Article//Sect//LI)', $dom);
    print "$count list nodes found.\n" if $count;
    لսႽ ᥲᥒ⚪⟊Ⴙᘓᖇ Ꮅᘓᖇ⎱ Ⴙᥲ𝇋ƙᘓᖇ

      Well, you're not really using XPathContext for anything, so it isn't required. This works:

      $dom->find('count(//Article//Sect//LI)' );

      Yes, that was an error. I don't need to create a new root. However, I do need to search for the "Include" node, since the document can have multiple H5 nodes.

      Speaking of the document, the XML structure is not consistent across documents. In this one, my list is under H5, but in another it can be anywhere else. So, if I give a range of paths, could you please let me know if this is correct?

      Thanks again for your help. I really appreciate your time.

      my $xc = XML::LibXML::XPathContext->new;
      my $count = $xc->findvalue('count(//Article//Sect//LI|//Article//Sect//Part//LI|//Article//Part//Li)', $dom);
      print "$count list nodes found.\n" if $count;
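A minimal sketch of how a union of paths like the one above behaves, assuming a small made-up document (the element layout here is hypothetical, not the poster's real data). It illustrates three XPath 1.0 facts: `//` already matches descendants at any depth, `|` returns a node set without duplicates, and element names are case-sensitive, so `Li` does not match `LI`.

```perl
#!/usr/bin/perl
use warnings;
use strict;

use XML::LibXML;

# Hypothetical sample document, for illustration only.
my $dom = XML::LibXML->load_xml(string => <<'__XML__');
<Article>
  <Sect>
    <LI>a</LI>
    <Part>
      <LI>b</LI>
    </Part>
  </Sect>
  <Part>
    <LI>c</LI>
  </Part>
</Article>
__XML__

# '//' spans any depth, so this already includes the LI nested in Sect/Part.
my $under_sect = $dom->findvalue('count(//Article//Sect//LI)');

# A union is a node set: nodes matched by several branches are counted once.
my $union = $dom->findvalue(
    'count(//Article//Sect//LI | //Article//Sect//Part//LI | //Article//Part//LI)'
);

# Element names are case-sensitive: 'Li' matches nothing in this document.
my $wrong_case = $dom->findvalue('count(//Article//Part//Li)');

print "under Sect: $under_sect, union: $union, wrong case: $wrong_case\n";
```

Under these assumptions, the middle branch of the union is redundant, because every node it matches is already matched by the first branch.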

      Hello! I tried the XPathContext approach and it works great. However, I do need to check for the condition "where H4 contains Include". The document can have multiple sections like the one above, and I need to count the list elements of this particular section only.

      This is the expression I am trying, which I know is wrong since it now looks for LI under H4. I am not sure if it's even possible to combine the two conditions in one expression, so I'm looking for some help here.

      Objective: count the number of LI elements under //Article//Main//Sect where the value of H4 contains "Include".

      $count = $dom->findvalue("count(//Article//Main//Sect//H4[contains(.,\"Include\")]/LI)");
      print $count;

      I'd greatly appreciate any help in this regard. Thanks!

        LI is not part of the H4. Move H4 into the condition:
        'count(//Article//Sect[contains(H4,"Include")]//LI)'
        لսႽ ᥲᥒ⚪⟊Ⴙᘓᖇ Ꮅᘓᖇ⎱ Ⴙᥲ𝇋ƙᘓᖇ
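For reference, here is a small self-contained sketch of the predicate-on-Sect expression in use. The sample XML is an assumption made up for illustration (two sections, only one of which has an H4 containing "Include"); it is not the original poster's document.

```perl
#!/usr/bin/perl
use warnings;
use strict;

use XML::LibXML;

# Hypothetical sample: two sections, only the first is the "Include" one.
my $dom = XML::LibXML->load_xml(string => <<'__XML__');
<Article>
  <Main>
    <Sect>
      <H4>Include</H4>
      <L>
        <LI><LI_Title>Criteria 1</LI_Title></LI>
        <LI><LI_Title>Criteria 2</LI_Title></LI>
      </L>
    </Sect>
    <Sect>
      <H4>Exclude</H4>
      <L>
        <LI><LI_Title>Criteria 3</LI_Title></LI>
      </L>
    </Sect>
  </Main>
</Article>
__XML__

# The predicate filters Sect elements by their H4 child;
# '//LI' then selects list items anywhere below the matching Sect.
my $count = $dom->findvalue(
    'count(//Article//Sect[contains(H4, "Include")]//LI)'
);

print "$count list nodes found in the Include section.\n";
```

Note that `contains()` is case-sensitive, so a heading spelled "include" would not match this expression.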
