PerlMonks  

XML Parsing

by gvinu4u (Acolyte)
on Sep 21, 2012 at 12:31 UTC (#994888)
gvinu4u has asked for the wisdom of the Perl Monks concerning the following question:

I have a line of XML and need help parsing the data. The XML data is below:
<workflows><workflow name="Generic Audio Driver Tests" id="8ff67565-7954-4a6c-84b4-6404af4a6357" owneralias="louiscl" featurepath="$\UX Platform\Audio\Driver" jobid="60112"><result starttime="9/5/2012 7:43:37 PM" executiontime="00:09:29.0354215" executionoutcome="Succeeded" testoutcome="Passed" /><testcases total="609" passed="609" failed="0" skipped="0" blocked="0" /><tasks><task name="Deploy Packages" id="975e822e-c4de-4cfd-9748-7d08ff5cfee0"><result starttime="9/5/2012 7:43:37 PM" executiontime="00:00:23.3509657" executionoutcome="Succeeded" /><outputdirectory><localpath>C:\Users\hivinod\TShell\Results\Submission\0905-194337\8ff67565\975e822e\1</localpath><remotepath>\\BDCLAB-WM7-10\C$\Users\hivinod\TShell\Results\Submission\0905-194337\8ff67565\975e822e\1</remotepath></outputdirectory><outputs><output filename="Microsoft.Phone.Test.UXPlatform.AVCore.AudioCore.Driver_DeployTest.log" fullpath="\\BDCLAB-WM7-10\C$\Users\hivinod\TShell\Results\Submission\0905-194337\8ff67565\975e822e\1\Microsoft.Phone.Test.UXPlatform.AVCore.AudioCore.Driver_DeployTest.log" /></outputs></task><task name="Execute Test Harness" id="9a1f022d-dd77-48f8-b3a3-60d0c70e1594"><commandline>TestLauncher -b te.exe -w "C:\data\test\result" -la -a \Data\Test\bin\gaudit_mc.dll /select:"not((@Name='*KSPROPERTY_PIN_NAME*')or(@Name='*INVALIDVALUE_AUDIO_MUX_SOURCE')or(@Name='*2_CONNECTION_DATAFORMAT')or(@Name='*3_CONNECTION_DATAFORMAT')or(@Name='*CHANNELULONG*VOLUMELEVEL'))" /enablewttlogging /outputFolder:\Data\Test\Result /logfile:results.wtl</commandline><workingdir>\Data</workingdir><result starttime="9/5/2012 7:44:00 PM" executiontime="00:09:05.5944478" executionoutcome="Succeeded" testoutcome="Passed" /><testcases total="609" passed="609" failed="0" skipped="0" blocked="0" /><outputdirectory><localpath>C:\Users\hivinod\TShell\Results\Submission\0905-194337\8ff67565\9a1f022d\1</localpath><remotepath>\\BDCLAB-WM7-10\C$\Users\hivinod\TShell\Results\Submission\0905-194337\8ff67565\9a1f022d\1</remotepath></outputdirectory><outputs><output filename="results.wtl" fullpath="\\BDCLAB-WM7-10\C$\Users\hivinod\TShell\Results\Submission\0905-194337\8ff67565\9a1f022d\1\results.wtl" /></outputs><analyses /></task><task name="Cleanup Dependencies" id="23e9c221-d748-49c3-9ee3-c6c3a31394b6"><result starttime="9/5/2012 7:53:06 PM" executiontime="00:00:00" executionoutcome="Succeeded" /></task></tasks></workflow>
I need to extract the following items: WorkflowName= Total="609" Passed="609" Failed="0" Skipped="0" Blocked="0". I am not sure whether to use a regex or an XML module, and I am a beginner with both. Kindly help me in this regard. Thanks, Gurus.

Re: XML Parsing
by toolic (Bishop) on Sep 21, 2012 at 12:58 UTC
    As formatted, your data is hard to understand. I tried to make it presentable using 2 tools (XML::Tidy and xmllint), but both returned errors.

    Therefore, I will only offer the general advice to use an XML parser. I have been happy with XML::Twig. It has a tutorial, and there are many code examples here at the Monastery (and elsewhere).
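
    A minimal sketch of that advice, assuming XML::Twig is installed and the input has first been made well-formed. The trimmed sample document and the helper name extract_counts are illustrative assumptions, not part of the original post:

```perl
use strict;
use warnings;
use XML::Twig;

# Hypothetical helper: returns one hash ref per <workflow>, holding the
# workflow name and the counts from its direct <testcases> child.
sub extract_counts {
    my ($xml) = @_;
    my @rows;
    my $twig = XML::Twig->new(
        twig_handlers => {
            # Partial path: only <testcases> directly under <workflow>,
            # not the per-task <testcases> elements deeper in the tree.
            'workflow/testcases' => sub {
                my ($t, $elt) = @_;
                my %row = (name => $elt->parent->att('name'));
                $row{$_} = $elt->att($_)
                    for qw(total passed failed skipped blocked);
                push @rows, \%row;
            },
        },
    );
    $twig->parse($xml);
    return @rows;
}

# Assumption: a trimmed, well-formed stand-in for the posted data.
my $sample = <<'XML';
<workflows>
  <workflow name="Generic Audio Driver Tests" jobid="60112">
    <testcases total="609" passed="609" failed="0" skipped="0" blocked="0" />
    <tasks>
      <task name="Execute Test Harness">
        <testcases total="609" passed="609" failed="0" skipped="0" blocked="0" />
      </task>
    </tasks>
  </workflow>
</workflows>
XML

for my $row (extract_counts($sample)) {
    print "WorkflowName=$row->{name}\n";
    print ucfirst($_), "=$row->{$_}\n"
        for qw(total passed failed skipped blocked);
}
```

    The handler path is kept narrow on purpose: the posted data repeats a testcases element inside one of the tasks, and only the workflow-level totals were asked for.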

Re: XML Parsing
by blue_cowdawg (Monsignor) on Sep 21, 2012 at 13:58 UTC

    AUUGH! My eyes! :-D

    My favorite XML parser is XML::Simple. Having said that I tried parsing this with a quick and dirty script and it would seem there is something very wrong with that XML... or it could be me.

    I attempted to clean up the XML and failed miserably. You might want to review its structure and make sure you have matching closures for all your attributes. Sample error I got:

    could not find ParserDetails.ini in /usr/lib/perl5/site_perl/5.14/XML/SAX
    Name <+passed> does not match NameChar production [Ln: 8, Col: 4859721220]


    Peter L. Berghold -- Unix Professional
    Peter -at- Berghold -dot- Net; AOL IM redcowdawg Yahoo IM: blue_cowdawg

      Hello

      His XML is missing the final '</workflows>' closing tag.
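
      One way to confirm this kind of problem before extracting anything is to ask a parser whether the text is well-formed at all. The sketch below assumes XML::LibXML is installed. The helper name is_well_formed and the two short sample strings are hypothetical stand-ins for the posted data:

```perl
use strict;
use warnings;
use XML::LibXML;

# Hypothetical helper: returns 1 if the string parses as well-formed XML,
# 0 otherwise.
sub is_well_formed {
    my ($xml) = @_;
    # load_xml dies on malformed input, so trap the exception with eval.
    return eval { XML::LibXML->load_xml(string => $xml); 1 } ? 1 : 0;
}

# Assumption: tiny stand-ins that mimic the missing closing tag.
my $truncated = '<workflows><workflow name="demo"></workflow>';
my $complete  = $truncated . '</workflows>';

print "truncated: ", (is_well_formed($truncated) ? "ok" : "not well-formed"), "\n";
print "complete:  ", (is_well_formed($complete)  ? "ok" : "not well-formed"), "\n";
```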

Re: XML Parsing
by tobyink (Abbot) on Sep 21, 2012 at 14:18 UTC
    use 5.010;
    use XML::LibXML;

    my $xml = XML::LibXML->load_xml(IO => \*DATA);

    foreach my $workflow ($xml->findnodes('//workflow')) {
        say $workflow->{name};
        my ($testcases) = $workflow->findnodes('./testcases');
        for (qw/ total passed failed skipped blocked /) {
            say "\t$_: $testcases->{$_}";
        }
    }

    __DATA__
    <workflows>
        <workflow name="Generic Audio Driver Tests" id="8ff67565-7954-4a6c-84b4-6404af4a6357" owneralias="louiscl" featurepath="$\UX Platform\Audio\Driver" jobid="60112">
            <result starttime="9/5/2012 7:43:37 PM" executiontime="00:09:29.0354215" executionoutcome="Succeeded" testoutcome="Passed" />
            <testcases total="609" passed="609" failed="0" skipped="0" blocked="0" />
            <tasks>
                <task name="Deploy Packages" id="975e822e-c4de-4cfd-9748-7d08ff5cfee0">
                    <result starttime="9/5/2012 7:43:37 PM" executiontime="00:00:23.3509657" executionoutcome="Succeeded" />
                    <outputdirectory>
                        <localpath>C:\Users\hivinod\TShell\Results\Submission\0905-194337\8ff67565\975e822e\1</localpath>
                        <remotepath>\\BDCLAB-WM7-10\C$\Users\hivinod\TShell\Results\Submission\0905-194337\8ff67565\975e822e\1</remotepath>
                    </outputdirectory>
                    <outputs>
                        <output filename="Microsoft.Phone.Test.UXPlatform.AVCore.AudioCore.Driver_DeployTest.log" fullpath="\\BDCLAB-WM7-10\C$\Users\hivinod\TShell\Results\Submission\0905-194337\8ff67565\975e822e\1\Microsoft.Phone.Test.UXPlatform.AVCore.AudioCore.Driver_DeployTest.log" />
                    </outputs>
                </task>
                <task name="Execute Test Harness" id="9a1f022d-dd77-48f8-b3a3-60d0c70e1594">
                    <commandline>TestLauncher -b te.exe -w "C:\data\test\result" -la -a \Data\Test\bin\gaudit_mc.dll /select:"not((@Name='*KSPROPERTY_PIN_NAME*')or(@Name='*INVALIDVALUE_AUDIO_MUX_SOURCE')or(@Name='*2_CONNECTION_DATAFORMAT')or(@Name='*3_CONNECTION_DATAFORMAT')or(@Name='*CHANNELULONG*VOLUMELEVEL'))" /enablewttlogging /outputFolder:\Data\Test\Result /logfile:results.wtl</commandline>
                    <workingdir>\Data</workingdir>
                    <result starttime="9/5/2012 7:44:00 PM" executiontime="00:09:05.5944478" executionoutcome="Succeeded" testoutcome="Passed" />
                    <testcases total="609" passed="609" failed="0" skipped="0" blocked="0" />
                    <outputdirectory>
                        <localpath>C:\Users\hivinod\TShell\Results\Submission\0905-194337\8ff67565\9a1f022d\1</localpath>
                        <remotepath>\\BDCLAB-WM7-10\C$\Users\hivinod\TShell\Results\Submission\0905-194337\8ff67565\9a1f022d\1</remotepath>
                    </outputdirectory>
                    <outputs>
                        <output filename="results.wtl" fullpath="\\BDCLAB-WM7-10\C$\Users\hivinod\TShell\Results\Submission\0905-194337\8ff67565\9a1f022d\1\results.wtl" />
                    </outputs>
                    <analyses />
                </task>
                <task name="Cleanup Dependencies" id="23e9c221-d748-49c3-9ee3-c6c3a31394b6">
                    <result starttime="9/5/2012 7:53:06 PM" executiontime="00:00:00" executionoutcome="Succeeded" />
                </task>
            </tasks>
        </workflow>
    </workflows>
    perl -E'sub Monkey::do{say$_,for@_,do{($monkey=[caller(0)]->[3])=~s{::}{ }and$monkey}}"Monkey say"->Monkey::do'

      With Twig.

      use XML::Twig;

      my $twig = XML::Twig->new(
          twig_handlers => {
              #'//*[total="609" and passed="609"]' => sub {   # i wonder why this doesn't work..?
              '//*[@total="609"]' => sub {
                  my ($twig, $elt) = @_;
                  # Use return rather than last: the handler is a sub, not a loop body.
                  return if $elt->att("passed")  ne "609";
                  return if $elt->att("skipped") ne "0";
                  return if $elt->att("blocked") ne "0";
                  # Print the whole <workflow> only for the workflow-level totals.
                  if ($elt->parent->gi eq 'workflow') {
                      $elt->parent->print;
                  }
              },
          },
          PrettyPrint => 'indented',
      )->parsefile('xml_without_error.xml');
