
XML Parsing

by gvinu4u (Acolyte)
on Sep 21, 2012 at 12:31 UTC ( #994888=perlquestion )
gvinu4u has asked for the wisdom of the Perl Monks concerning the following question:

I have a line of XML and need help parsing the data. The XML data is below:
<workflows><workflow name="Generic Audio Driver Tests" id="8ff67565-7954-4a6c-84b4-6404af4a6357" owneralias="louiscl" featurepath="$\UX Platform\Audio\Driver" jobid="60112"><result starttime="9/5/2012 7:43:37 PM" executiontime="00:09:29.0354215" executionoutcome="Succeeded" testoutcome="Passed" /><testcases total="609" passed="609" failed="0" skipped="0" blocked="0" /><tasks><task name="Deploy Packages" id="975e822e-c4de-4cfd-9748-7d08ff5cfee0"><result starttime="9/5/2012 7:43:37 PM" executiontime="00:00:23.3509657" executionoutcome="Succeeded" /><outputdirectory><localpath>C:\Users\hivinod\TShell\Results\Submission\0905-194337\8ff67565\975e822e\1</localpath><remotepath>\\BDCLAB-WM7-10\C$\Users\hivinod\TShell\Results\Submission\0905-194337\8ff67565\975e822e\1</remotepath></outputdirectory><outputs><output filename="Microsoft.Phone.Test.UXPlatform.AVCore.AudioCore.Driver_DeployTest.log" fullpath="\\BDCLAB-WM7-10\C$\Users\hivinod\TShell\Results\Submission\0905-194337\8ff67565\975e822e\1\Microsoft.Phone.Test.UXPlatform.AVCore.AudioCore.Driver_DeployTest.log" /></outputs></task><task name="Execute Test Harness" id="9a1f022d-dd77-48f8-b3a3-60d0c70e1594"><commandline>TestLauncher -b te.exe -w "C:\data\test\result" -la -a \Data\Test\bin\gaudit_mc.dll /select:"not((@Name='*KSPROPERTY_PIN_NAME*')or(@Name='*INVALIDVALUE_AUDIO_MUX_SOURCE')or(@Name='*2_CONNECTION_DATAFORMAT')or(@Name='*3_CONNECTION_DATAFORMAT')or(@Name='*CHANNELULONG*VOLUMELEVEL'))" /enablewttlogging /outputFolder:\Data\Test\Result /logfile:results.wtl</commandline><workingdir>\Data</workingdir><result starttime="9/5/2012 7:44:00 PM" executiontime="00:09:05.5944478" executionoutcome="Succeeded" testoutcome="Passed" /><testcases total="609" passed="609" failed="0" skipped="0" blocked="0" /><outputdirectory><localpath>C:\Users\hivinod\TShell\Results\Submission\0905-194337\8ff67565\9a1f022d\1</localpath><remotepath>\\BDCLAB-WM7-10\C$\Users\hivinod\TShell\Results\Submission\0905-194337\8ff67565\9a1f022d\1</remotepath></outputdirectory><outputs><output filename="results.wtl" fullpath="\\BDCLAB-WM7-10\C$\Users\hivinod\TShell\Results\Submission\0905-194337\8ff67565\9a1f022d\1\results.wtl" /></outputs><analyses /></task><task name="Cleanup Dependencies" id="23e9c221-d748-49c3-9ee3-c6c3a31394b6"><result starttime="9/5/2012 7:53:06 PM" executiontime="00:00:00" executionoutcome="Succeeded" /></task></tasks></workflow>
I need to extract the items below:
WorkflowName= Total="609" Passed="609" Failed="0" Skipped="0" Blocked="0"
I'm not sure whether to use a regex or an XML module; I'm a beginner in both. Kindly help me in this regard. Thanks, Gurus.

Replies are listed 'Best First'.
Re: XML Parsing
by toolic (Bishop) on Sep 21, 2012 at 12:58 UTC
    As formatted, your data is hard to understand. I tried to make it presentable using 2 tools (XML::Tidy and xmllint), but both returned errors.

    Therefore, I will only offer the general advice to use an XML parser. I have been happy with XML::Twig. It has a tutorial, and there are many code examples here at the Monastery (and elsewhere).
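For the specific items asked about, an XML::Twig handler on `workflow` could look roughly like the sketch below. It assumes XML::Twig is installed and that the input is well-formed; the helper name `workflow_summaries` and the trimmed-down `$sample` string are made up for illustration and are not the original data.

```perl
use strict;
use warnings;
use XML::Twig;

# Hypothetical helper: returns one hash ref per <workflow>, holding its name
# and the counters from the <testcases> element directly beneath it.
sub workflow_summaries {
    my ($xml) = @_;
    my @summaries;
    my $twig = XML::Twig->new(
        twig_handlers => {
            # Fires once each <workflow> element has been fully parsed.
            workflow => sub {
                my ($t, $workflow) = @_;
                my $testcases = $workflow->first_child('testcases') or return;
                push @summaries, {
                    name => $workflow->att('name'),
                    map { ($_ => $testcases->att($_)) }
                        qw(total passed failed skipped blocked),
                };
            },
        },
    );
    $twig->parse($xml);    # dies if the XML is not well-formed
    return @summaries;
}

# Trimmed-down sample in the same shape as the posted data (illustrative only).
my $sample = '<workflows><workflow name="Generic Audio Driver Tests" jobid="60112">'
           . '<testcases total="609" passed="609" failed="0" skipped="0" blocked="0" />'
           . '<tasks><task name="Execute Test Harness">'
           . '<testcases total="609" passed="609" failed="0" skipped="0" blocked="0" />'
           . '</task></tasks></workflow></workflows>';

for my $s (workflow_summaries($sample)) {
    print "WorkflowName=$s->{name}\n";
    print join(' ', map { ucfirst($_) . '="' . $s->{$_} . '"' }
        qw(total passed failed skipped blocked)), "\n";
}
```

`first_child('testcases')` only looks at direct children of the workflow, so the per-task `testcases` elements further down are not picked up.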

Re: XML Parsing
by blue_cowdawg (Monsignor) on Sep 21, 2012 at 13:58 UTC

    AUUGH! My eyes! :-D

    My favorite XML parser is XML::Simple. Having said that, I tried parsing this with a quick and dirty script, and it would seem there is something very wrong with that XML... or it could be me.

    I attempted to clean up the XML and failed miserably. You might want to review its structure and make sure you have matching closures for all your attributes. Sample error I got:

    could not find ParserDetails.ini in /usr/lib/perl5/site_perl/5.14/XML/SAX
    Name <+passed> does not match NameChar production [Ln: 8, Col: 4859721220]

    Peter L. Berghold -- Unix Professional
    Peter -at- Berghold -dot- Net; AOL IM redcowdawg Yahoo IM: blue_cowdawg


      His XML doesn't have the closing '</workflows>' tag.
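One way to confirm that kind of problem is to ask a parser directly. This is a minimal sketch, assuming XML::LibXML is installed; `wellformed_error` is a hypothetical helper, and `$posted` is a shortened stand-in for the posted data, not the data itself.

```perl
use strict;
use warnings;
use XML::LibXML;

# Hypothetical helper: returns undef when $xml parses cleanly,
# otherwise the parser's error message as a string.
sub wellformed_error {
    my ($xml) = @_;
    my $ok = eval { XML::LibXML->load_xml(string => $xml); 1 };
    return $ok ? undef : "$@";
}

# Shortened stand-in for the posted data: the root element is never closed.
my $posted = '<workflows><workflow name="Generic Audio Driver Tests">'
           . '<testcases total="609" passed="609" /></workflow>';

if (defined wellformed_error($posted)) {
    print "as posted: not well-formed\n";
}

# Appending the missing closing tag is enough for this stand-in to parse.
my $repaired = $posted . '</workflows>';
print "with </workflows> appended: ",
    (defined wellformed_error($repaired) ? "still broken" : "well-formed"), "\n";
```

If the real file still fails after the closing tag is added, the message returned by the parser should point at whatever else is wrong.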

Re: XML Parsing
by tobyink (Abbot) on Sep 21, 2012 at 14:18 UTC
    use 5.010;
    use XML::LibXML;

    my $xml = XML::LibXML->load_xml(IO => \*DATA);

    foreach my $workflow ($xml->findnodes('//workflow')) {
        say $workflow->{name};
        my ($testcases) = $workflow->findnodes('./testcases');
        for (qw/ total passed failed skipped blocked /) {
            say "\t$_: $testcases->{$_}";
        }
    }

    __DATA__
    <workflows>
      <workflow name="Generic Audio Driver Tests" id="8ff67565-7954-4a6c-84b4-6404af4a6357" owneralias="louiscl" featurepath="$\UX Platform\Audio\Driver" jobid="60112">
        <result starttime="9/5/2012 7:43:37 PM" executiontime="00:09:29.0354215" executionoutcome="Succeeded" testoutcome="Passed" />
        <testcases total="609" passed="609" failed="0" skipped="0" blocked="0" />
        <tasks>
          <task name="Deploy Packages" id="975e822e-c4de-4cfd-9748-7d08ff5cfee0">
            <result starttime="9/5/2012 7:43:37 PM" executiontime="00:00:23.3509657" executionoutcome="Succeeded" />
            <outputdirectory>
              <localpath>C:\Users\hivinod\TShell\Results\Submission\0905-194337\8ff67565\975e822e\1</localpath>
              <remotepath>\\BDCLAB-WM7-10\C$\Users\hivinod\TShell\Results\Submission\0905-194337\8ff67565\975e822e\1</remotepath>
            </outputdirectory>
            <outputs>
              <output filename="Microsoft.Phone.Test.UXPlatform.AVCore.AudioCore.Driver_DeployTest.log" fullpath="\\BDCLAB-WM7-10\C$\Users\hivinod\TShell\Results\Submission\0905-194337\8ff67565\975e822e\1\Microsoft.Phone.Test.UXPlatform.AVCore.AudioCore.Driver_DeployTest.log" />
            </outputs>
          </task>
          <task name="Execute Test Harness" id="9a1f022d-dd77-48f8-b3a3-60d0c70e1594">
            <commandline>TestLauncher -b te.exe -w "C:\data\test\result" -la -a \Data\Test\bin\gaudit_mc.dll /select:"not((@Name='*KSPROPERTY_PIN_NAME*')or(@Name='*INVALIDVALUE_AUDIO_MUX_SOURCE')or(@Name='*2_CONNECTION_DATAFORMAT')or(@Name='*3_CONNECTION_DATAFORMAT')or(@Name='*CHANNELULONG*VOLUMELEVEL'))" /enablewttlogging /outputFolder:\Data\Test\Result /logfile:results.wtl</commandline>
            <workingdir>\Data</workingdir>
            <result starttime="9/5/2012 7:44:00 PM" executiontime="00:09:05.5944478" executionoutcome="Succeeded" testoutcome="Passed" />
            <testcases total="609" passed="609" failed="0" skipped="0" blocked="0" />
            <outputdirectory>
              <localpath>C:\Users\hivinod\TShell\Results\Submission\0905-194337\8ff67565\9a1f022d\1</localpath>
              <remotepath>\\BDCLAB-WM7-10\C$\Users\hivinod\TShell\Results\Submission\0905-194337\8ff67565\9a1f022d\1</remotepath>
            </outputdirectory>
            <outputs>
              <output filename="results.wtl" fullpath="\\BDCLAB-WM7-10\C$\Users\hivinod\TShell\Results\Submission\0905-194337\8ff67565\9a1f022d\1\results.wtl" />
            </outputs>
            <analyses />
          </task>
          <task name="Cleanup Dependencies" id="23e9c221-d748-49c3-9ee3-c6c3a31394b6">
            <result starttime="9/5/2012 7:53:06 PM" executiontime="00:00:00" executionoutcome="Succeeded" />
          </task>
        </tasks>
      </workflow>
    </workflows>
    perl -E'sub Monkey::do{say$_,for@_,do{($monkey=[caller(0)]->[3])=~s{::}{ }and$monkey}}"Monkey say"->Monkey::do'

      With Twig.

      use XML::Twig;

      my $twig=XML::Twig->new(
          twig_handlers =>{
              #'//*[total="609" and passed="609"]' =>sub { # i wonder why this doesn't work..?
              '//*[@total="609"]' =>sub {
                  my ($twig,$elt)=@_;
                  # return (not last) to leave the handler early: it is a sub, not a loop
                  return if ( $elt->att("passed") ne "609");
                  return if ( $elt->att("skipped") ne "0");
                  return if ( $elt->att("blocked") ne "0");
                  if ($elt->parent->gi eq 'workflow' ){
                      $elt->parent->print;
                  }
              }
          },
          PrettyPrint=>'indented',
      )->parsefile('xml_without_error.xml');
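On the commented-out condition: in XPath, a bare name inside a predicate refers to a child element, and attributes need the `@` prefix, which is probably why the first version matched nothing. The sketch below shows the difference with XML::LibXML, which implements full XPath 1.0; whether XML::Twig's handler syntax accepts the same `and` expression is not checked here. The small document is illustrative, not the original data.

```perl
use strict;
use warnings;
use XML::LibXML;

# Small illustrative document in the same shape as the posted data.
my $doc = XML::LibXML->load_xml(string =>
      '<workflows><workflow name="Generic Audio Driver Tests">'
    . '<testcases total="609" passed="609" failed="0" skipped="0" blocked="0" />'
    . '</workflow></workflows>');

# Without "@", total and passed are read as child element names: no match.
my @by_child = $doc->findnodes('//*[total="609" and passed="609"]');

# With "@", the same test is applied to attributes: matches <testcases>.
my @by_attr = $doc->findnodes('//*[@total="609" and @passed="609"]');

printf "child-element test: %d match(es)\n", scalar @by_child;
printf "attribute test:     %d match(es)\n", scalar @by_attr;
print  "parent workflow:    ",
    $by_attr[0]->parentNode->getAttribute('name'), "\n";
```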
