PerlMonks  

parsing multi level XML with XML::Simple

by lcheung (Initiate)
on Jan 25, 2012 at 21:44 UTC ( [id://949976]=perlquestion )

lcheung has asked for the wisdom of the Perl Monks concerning the following question:

I'm trying to parse XML like the sample below with XML::Simple, and I'm not sure what kind of loop I should be using. I thought about using a for loop, but it will only visit the <level> elements and not the <study> elements (<study> is the lowest level, but it can appear at any level in the XML, as you can see):
for my $level1_ref (@{$report_config->{"levels"}->{"level"}}) { print... }

Here's the XML

<config>
  <levels name="a" target="29.6">
    <level name="b" weight="50" target="35.1">
      <study id="32" name="i" weight="30" />
      <study id="36" name="j" weight="70" />
    </level>
    <study id="37" name="k" weight="15" />
    <level name="c" weight="13" target="22.1">
      <level name="d" weight="69.2">
        <level name="e" weight="44.4">
          <study id="34" name="l" weight="50" />
          <study id="27" name="m" weight="50" />
        </level>
        <level name="f" weight="55.6">
          <study id="25" name="n" weight="60" />
          <study id="38" name="o" weight="40" />
        </level>
      </level>
      <level name="g" weight="30.8">
        <study id="50" name="p" weight="75" />
        <study id="70" name="q" weight="25" />
      </level>
    </level>
    <level name="h" weight="22" target="19.0">
      <study id="40" name="r" weight="90.9" />
      <study id="22" name="s" weight="9.1" />
    </level>
  </levels>
  <study id="19" name="t" weight="0" spec="p0f7=2" />
  <study id="19" name="u" weight="0" spec="p0f7=1" />
  <study id="62" name="v" weight="0" />
</config>

cheers

Replies are listed 'Best First'.
Re: parsing multi level XML with XML::Simple
by dasgar (Priest) on Jan 26, 2012 at 05:25 UTC

    When using XML::Simple, it's usually very helpful to dump the parsed result with something like Data::Dumper to understand the structure of the data. In my experience, attributes usually become hash keys, and repeated tags at the same level usually become arrays.

    Here's the code that I came up with to parse your XML data and run it through Data::Dumper.

    use strict;
    use warnings;
    use XML::Simple;
    use Data::Dumper;

    my $file = "data.xml";
    my $xml  = XMLin($file);
    print Dumper($xml);

    And here's the output.

    That output should help you figure out how to write your code to traverse the data structure. If you don't like that data structure, you can check the options available for the XMLin function, or you might need to look at other XML parsing modules. Since I have only used XML::Simple myself, I can't really recommend any other modules to try.

Re: parsing multi level XML with XML::Simple
by ikegami (Patriarch) on Jan 25, 2012 at 22:06 UTC

    You have a recursive structure. Recursion is the obvious method of handling this. I'd help you with the implementation, but you didn't specify what information you want to extract (i.e. what output you expect).
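A minimal sketch of the recursive approach, not a definitive implementation. It assumes the XML was loaded with XMLin($file, ForceArray => ['level', 'study'], KeyAttr => []), so that every <level> and <study> is an array reference of hashes. To stay self-contained, the hash below is hand-written to mirror what that call is assumed to return for a trimmed-down copy of the posted XML; the names $config and collect_studies are hypothetical.

```perl
use strict;
use warnings;

# Assumption: this mirrors the structure XML::Simple would return for a
# trimmed copy of the posted XML with
#   ForceArray => ['level', 'study'], KeyAttr => []
my $config = {
    levels => {
        name  => 'a',
        level => [
            {
                name  => 'b',
                study => [ { id => 32, name => 'i' }, { id => 36, name => 'j' } ],
            },
            {
                name  => 'c',
                level => [
                    { name => 'g', study => [ { id => 50, name => 'p' } ] },
                ],
            },
        ],
        study => [ { id => 37, name => 'k' } ],
    },
};

# Hypothetical helper: walk one level, report its own studies,
# then recurse into each nested level with the extended path.
sub collect_studies {
    my ($node, @path) = @_;
    push @path, $node->{name};

    my @found = map { join('/', @path) . ": $_->{name}" }
                @{ $node->{study} || [] };

    push @found, collect_studies($_, @path) for @{ $node->{level} || [] };
    return @found;
}

my @studies = collect_studies($config->{levels});
print "$_\n" for @studies;
```

Each call handles the studies at one depth and then calls itself for the nested levels, so the depth of the XML does not matter. One caveat: because XML::Simple groups children by tag name, the relative order of sibling <level> and <study> elements is not preserved in this structure.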

Re: parsing multi level XML with XML::Simple
by tobyink (Canon) on Jan 26, 2012 at 13:32 UTC

    In my experience, XML::Simple usually causes more trouble than it's worth. Its default behaviour is too smart for its own good. I used to always use XML::Simple, but now I've seen the light and haven't used it for years.

    use XML::LibXML;

    my $dom = XML::LibXML->load_xml(IO => \*DATA);

    foreach my $level ($dom->findnodes('//level | //levels')) {
        printf("==== level '%s' ====\n", $level->getAttribute('name'));
        foreach my $child ($level->getChildrenByTagName('*')) {
            printf("contains %s '%s'\n", $child->nodeName, $child->getAttribute('name'));
        }
        print "\n";
    }

    __DATA__
    <config>
      <levels name="a" target="29.6">
        <level name="b" weight="50" target="35.1">
          <study id="32" name="i" weight="30" />
          <study id="36" name="j" weight="70" />
        </level>
        <study id="37" name="k" weight="15" />
        <level name="c" weight="13" target="22.1">
          <level name="d" weight="69.2">
            <level name="e" weight="44.4">
              <study id="34" name="l" weight="50" />
              <study id="27" name="m" weight="50" />
            </level>
            <level name="f" weight="55.6">
              <study id="25" name="n" weight="60" />
              <study id="38" name="o" weight="40" />
            </level>
          </level>
          <level name="g" weight="30.8">
            <study id="50" name="p" weight="75" />
            <study id="70" name="q" weight="25" />
          </level>
        </level>
        <level name="h" weight="22" target="19.0">
          <study id="40" name="r" weight="90.9" />
          <study id="22" name="s" weight="9.1" />
        </level>
      </levels>
      <study id="19" name="t" weight="0" spec="p0f7=2" />
      <study id="19" name="u" weight="0" spec="p0f7=1" />
      <study id="62" name="v" weight="0" />
    </config>
      Yes. And once you get tired of typing arrows and long method names, you can switch to XML::XSH2:
      open file.xml ;
      for (//level | //levels) {
          echo ==== level @name ==== ;
          for * echo contains name() @name ;
      }
Re: parsing multi level XML with XML::Simple
by Jenda (Abbot) on Jan 26, 2012 at 09:51 UTC

    If the data structure printed by Data::Dumper doesn't look convenient, you can tweak it with XML::Simple's options. Look especially at KeyAttr and ForceArray. Another option is to use XML::Rules; it'll give you even more control and, if needed, it'll let you handle the XML as it's being parsed. See Simpler than XML::Simple.
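As a rough illustration of those two options, here is a sketch that parses a shortened copy of the posted XML from a string. The shortened sample and the variable names ($xml, $config, $top) are assumptions for the example; ForceArray and KeyAttr are the XML::Simple options named above.

```perl
use strict;
use warnings;
use XML::Simple qw(XMLin);

# Shortened copy of the posted XML, kept small for illustration.
my $xml = <<'XML';
<config>
  <levels name="a">
    <level name="b">
      <study id="32" name="i" />
    </level>
    <study id="37" name="k" />
  </levels>
</config>
XML

my $config = XMLin(
    $xml,
    ForceArray => [ 'level', 'study' ],  # always array refs, even for one child
    KeyAttr    => [],                    # do not fold arrays into hashes keyed on name/id
);

# The root <config> element is dropped by default, so <levels> is a top-level key.
my $top = $config->{levels};

print ref($top->{level}), "\n";                 # ARRAY
print $top->{level}[0]{study}[0]{name}, "\n";   # i
print $top->{study}[0]{id}, "\n";               # 37
```

With these settings every <level> and <study> has the same shape wherever it appears, which makes a generic traversal much easier to write than with the defaults, where a single child is a plain hash and named children are folded into a hash keyed on the name attribute.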

    Jenda
    Enoch was right!
    Enjoy the last years of Rome.

Re: parsing multi level XML with XML::Simple
by lcheung (Initiate) on Jan 25, 2012 at 21:47 UTC

    perhaps I should be using a while loop? examples? Thanks!
