PerlMonks  

Extraction of value with XMLLIB

by shak (Initiate)
on May 05, 2015 at 14:32 UTC (#1125740)
shak has asked for the wisdom of the Perl Monks concerning the following question:

Hi all, I have a huge XML file and need to extract the data below from it with XML::LibXML.
Desired output:
VAL1,0,0
VAL2,0,0
VAL3,0,0
VAL4,0,0
VAL5,490783914,4532
My code:
my $parser = XML::LibXML->new();
my $doc = $parser->parse_file($filename);
foreach my $book ($doc->findnodes('/mdc/md/mt[text()="VAL1"]')) {
    $val1 = $book->findnodes('./r[1]/text()');
    push(@Val, $val1);
}
Input XML file:
<mdc xmlns:HTML="http://www.w3.org/TR/REC-xml">
  <md>
    <neid>
      <neun></neun>
      <nedn>GET_SUB</nedn>
      <nesw>R4BA06</nesw>
    </neid>
    <mi>
      <mts>20150429141500Z</mts>
      <gp>900</gp>
      <mt>VAL1</mt>
      <mt>VAL2</mt>
      <mt>VAL3</mt>
      <mt>VAL4</mt>
      <mt>VAL5</mt>
      <mt>VAL6</mt>
      <mt>VAL7</mt>
      <mt>VAL8</mt>
      <mv>
        <moid>NAME</moid>
        <r>0</r> <r>0</r> <r>0</r> <r>0</r> <r>490783914</r> <r>0</r> <r>0</r> <r>0</r>
      </mv>
      <mv>
        <moid>NAME1</moid>
        <r>0</r> <r>0</r> <r>0</r> <r>0</r> <r>4532</r> <r>0</r> <r>0</r> <r>0</r>
      </mv>
    </mi>
  </md>
</mdc>

Re: Extraction of value with XMLLIB
by Anonymous Monk on May 05, 2015 at 14:56 UTC
    So what is the problem?
Re: Extraction of value with XMLLIB
by choroba (Canon) on May 05, 2015 at 15:00 UTC
    The basic problem is your XPath expression doesn't follow the structure of the document. <mt> is not a child of <md>, there's a <mi> in between. Therefore, something like the following should match all the VAL nodes:
    /mdc/md/mi/mt[contains(.,"VAL")]

    The same holds for the <r> elements: their parent is <mv>, not <mt>.

    The following works for me:

    #!/usr/bin/perl
    use warnings;
    use strict;
    use feature qw{ say };

    use XML::LibXML;

    my @val;
    my $doc = 'XML::LibXML'->load_xml( location => 'file.xml' );
    for my $book ($doc->findnodes('/mdc/md/mi/mt[contains(.,"VAL")]')) {
        my $order = 1 + $book->findvalue('count(preceding-sibling::mt)');
        my $rs = $book->findnodes("../mv/r[$order]");
        say join ', ', map $_->textContent, $book, @$rs;
    }

    You can get the same logic with XML::XSH2, which is a wrapper around XML::LibXML:

    open file.xml ;
    for /mdc/md/mi/mt[xsh:match(.,'^VAL[0-9]+')] {
        my $order = 1 + count(preceding-sibling::mt) ;
        echo xsh:join(', ', (.), ../mv/r[$order]) ;
    }
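
Both snippets above rely on the same positional idea: the Nth <mt> under an <mi> names the Nth <r> inside each sibling <mv>. As a rough, self-contained sketch of that idea (not code from the thread), the pairing can also be done by indexing the <mt> list directly instead of counting preceding siblings. The helper name rows_by_position and the shortened inline sample document are assumptions made for illustration only.

```perl
#!/usr/bin/perl
use warnings;
use strict;
use feature qw{ say };

use XML::LibXML;

# Hypothetical helper (not from the thread): for every <mi>, pair the
# Nth <mt> name with the Nth <r> of each sibling <mv>, and return one
# "NAME,value,value,..." string per <mt>.
sub rows_by_position {
    my ($doc) = @_;
    my @rows;
    for my $mi ($doc->findnodes('/mdc/md/mi')) {
        my @names = map $_->textContent, $mi->findnodes('mt');
        my @mvs   = $mi->findnodes('mv');
        for my $i (0 .. $#names) {
            my $pos    = $i + 1;    # XPath positions are 1-based
            my @values = map $_->findvalue("r[$pos]"), @mvs;
            push @rows, join ',', $names[$i], @values;
        }
    }
    return @rows;
}

# Shortened sample in the shape of the question's input (assumed data:
# only two <mt> columns are kept to make the example small).
my $xml = q{<mdc>
  <md>
    <mi>
      <mt>VAL1</mt>
      <mt>VAL5</mt>
      <mv><moid>NAME</moid><r>0</r><r>490783914</r></mv>
      <mv><moid>NAME1</moid><r>0</r><r>4532</r></mv>
    </mi>
  </md>
</mdc>};

my $doc = XML::LibXML->load_xml( string => $xml );
say for rows_by_position($doc);
```

With the sample above this prints VAL1,0,0 and VAL5,490783914,4532. Looping over each <mi> separately keeps the column positions local to that block, which may matter if the real file contains several <mi> sections with different <mt> lists.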

    لսႽ ᥲᥒ⚪⟊Ⴙᘓᖇ Ꮅᘓᖇ⎱ Ⴙᥲ𝇋ƙᘓᖇ
