http://www.perlmonks.org?node_id=949976

lcheung has asked for the wisdom of the Perl Monks concerning the following question:

I'm trying to parse an XML document like the one below with XML::Simple. I'm not sure which loop I should be using. I thought about a for loop, but it only visits the <level> elements and not the <study> elements (<study> is the lowest level but can appear at any depth in the XML, as you can see):
for my $level1_ref (@{$report_config->{"levels"}->{"level"}}) { print... }

Here's the XML

<config>
  <levels name="a" target="29.6">
    <level name="b" weight="50" target="35.1">
      <study id="32" name="i" weight="30" />
      <study id="36" name="j" weight="70" />
    </level>
    <study id="37" name="k" weight="15" />
    <level name="c" weight="13" target="22.1">
      <level name="d" weight="69.2">
        <level name="e" weight="44.4">
          <study id="34" name="l" weight="50" />
          <study id="27" name="m" weight="50" />
        </level>
        <level name="f" weight="55.6">
          <study id="25" name="n" weight="60" />
          <study id="38" name="o" weight="40" />
        </level>
      </level>
      <level name="g" weight="30.8">
        <study id="50" name="p" weight="75" />
        <study id="70" name="q" weight="25" />
      </level>
    </level>
    <level name="h" weight="22" target="19.0">
      <study id="40" name="r" weight="90.9" />
      <study id="22" name="s" weight="9.1" />
    </level>
  </levels>
  <study id="19" name="t" weight="0" spec="p0f7=2" />
  <study id="19" name="u" weight="0" spec="p0f7=1" />
  <study id="62" name="v" weight="0" />
</config>

cheers

Re: parsing multi level XML with XML::Simple
by dasgar (Priest) on Jan 26, 2012 at 05:25 UTC

    When using XML::Simple, it's usually very helpful to use something like Data::Dumper to help understand the structure of the data. From my personal experience, attributes are usually hash keys and multiple tags at a level are usually arrays.

    Here's the code that I came up with to parse your XML data and run it through Data::Dumper.

    use strict;
    use warnings;
    use XML::Simple;
    use Data::Dumper;

    my $file = "data.xml";
    my $xml = XMLin($file);
    print Dumper($xml);

    And here's the output.

    That output should help you figure out how to write your code to traverse the data structure. If that data structure doesn't suit you, you can look at the options available for the XMLin function, or you might need to try other XML parsing modules. Since I have personally only used XML::Simple, I can't really recommend any other modules.

Re: parsing multi level XML with XML::Simple
by ikegami (Patriarch) on Jan 25, 2012 at 22:06 UTC

    You have a recursive structure. Recursion is the obvious method of handling this. I'd help you with the implementation, but you didn't specify what information you want to extract (i.e. what output you expect).
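    A minimal sketch of the recursive approach, assuming the goal is simply to list every level together with its studies (the thread does not say what output is wanted). The helper name walk, the trimmed sample XML, and the choice of ForceArray => [qw(level study)] with KeyAttr => [] are assumptions made for illustration; they make every <level> and <study> an array of hashes, so the same code works at any depth.

```perl
use strict;
use warnings;
use XML::Simple;

# Trimmed copy of the XML from the question (assumption: same shape, fewer nodes).
my $xml = <<'XML';
<config>
  <levels name="a" target="29.6">
    <level name="b" weight="50" target="35.1">
      <study id="32" name="i" weight="30" />
      <study id="36" name="j" weight="70" />
    </level>
    <study id="37" name="k" weight="15" />
    <level name="c" weight="13" target="22.1">
      <level name="g" weight="30.8">
        <study id="50" name="p" weight="75" />
      </level>
    </level>
  </levels>
  <study id="62" name="v" weight="0" />
</config>
XML

# Force <level> and <study> to always be arrays and disable folding by name/id,
# so every node has the same predictable shape.
my $config = XMLin($xml, ForceArray => [qw(level study)], KeyAttr => []);

# Hypothetical helper: record one node, its studies, then recurse into child levels.
sub walk {
    my ($node, $depth, $out) = @_;
    my $indent = '  ' x $depth;
    push @$out, "${indent}level $node->{name}";
    push @$out, "$indent  study $_->{name} (id $_->{id})"
        for @{ $node->{study} || [] };
    walk($_, $depth + 1, $out) for @{ $node->{level} || [] };
    return $out;
}

my $lines = walk($config->{levels}, 0, []);
print "$_\n" for @$lines;
```

    Note that XML::Simple stores <study> and <level> children under separate hash keys, so the relative order of a study and a sibling level (such as "k" between "b" and "c") is not preserved in this structure.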

Re: parsing multi level XML with XML::Simple
by tobyink (Canon) on Jan 26, 2012 at 13:32 UTC

    In my experience, XML::Simple usually causes more trouble than it's worth. Its default behaviour is too smart for its own good. I used to always use XML::Simple, but now I've seen the light and haven't used it for years.

    use XML::LibXML;

    my $dom = XML::LibXML->load_xml(IO => \*DATA);

    foreach my $level ($dom->findnodes('//level | //levels')) {
        printf("==== level '%s' ====\n", $level->getAttribute('name'));
        foreach my $child ($level->getChildrenByTagName('*')) {
            printf("contains %s '%s'\n", $child->nodeName, $child->getAttribute('name'));
        }
        print "\n";
    }

    __DATA__
    <config>
      <levels name="a" target="29.6">
        <level name="b" weight="50" target="35.1">
          <study id="32" name="i" weight="30" />
          <study id="36" name="j" weight="70" />
        </level>
        <study id="37" name="k" weight="15" />
        <level name="c" weight="13" target="22.1">
          <level name="d" weight="69.2">
            <level name="e" weight="44.4">
              <study id="34" name="l" weight="50" />
              <study id="27" name="m" weight="50" />
            </level>
            <level name="f" weight="55.6">
              <study id="25" name="n" weight="60" />
              <study id="38" name="o" weight="40" />
            </level>
          </level>
          <level name="g" weight="30.8">
            <study id="50" name="p" weight="75" />
            <study id="70" name="q" weight="25" />
          </level>
        </level>
        <level name="h" weight="22" target="19.0">
          <study id="40" name="r" weight="90.9" />
          <study id="22" name="s" weight="9.1" />
        </level>
      </levels>
      <study id="19" name="t" weight="0" spec="p0f7=2" />
      <study id="19" name="u" weight="0" spec="p0f7=1" />
      <study id="62" name="v" weight="0" />
    </config>
      Yes. And once you get tired of typing arrows and long method names, you can switch to XML::XSH2:
      open file.xml ;
      for (//level | //levels) {
          echo ==== level @name ==== ;
          for * echo contains name() @name ;
      }
Re: parsing multi level XML with XML::Simple
by Jenda (Abbot) on Jan 26, 2012 at 09:51 UTC

    If the data structure as printed by Data::Dumper doesn't look convenient, you can tweak it with XML::Simple's options. Look especially at KeyAttr and ForceArray. Another option is to use XML::Rules; it gives you even more control and, if needed, lets you handle the XML as it's being parsed. See Simpler than XML::Simple.
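    A small sketch of what those two options change, using a two-level fragment made up for illustration (the variable names $folded and $listed are also just for this example). With the defaults, XML::Simple folds a list of elements that carry a name attribute into a hash keyed by that name; with KeyAttr => [] and ForceArray the same elements stay in an array, which is what a for loop over @{ ... } expects.

```perl
use strict;
use warnings;
use XML::Simple;

# Illustrative fragment in the style of the question's XML.
my $xml = '<levels name="a"><level name="b" weight="50" /><level name="c" weight="13" /></levels>';

# Defaults: KeyAttr is ['name', 'key', 'id'], so the <level> list is folded
# into a hash keyed by each element's name attribute.
my $folded = XMLin($xml);
print ref($folded->{level}), "\n";                        # HASH
print join(',', sort keys %{ $folded->{level} }), "\n";   # b,c

# No folding, and <level> is always an array, even if only one is present.
my $listed = XMLin($xml, KeyAttr => [], ForceArray => ['level']);
print ref($listed->{level}), "\n";                        # ARRAY
print $listed->{level}[0]{name}, "\n";                    # b
```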

    Jenda
    Enoch was right!
    Enjoy the last years of Rome.

Re: parsing multi level XML with XML::Simple
by lcheung (Initiate) on Jan 25, 2012 at 21:47 UTC

    Perhaps I should be using a while loop? Any examples? Thanks!
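    For what it's worth, a while loop can do the job if it works through an explicit to-do list instead of recursing. The following is only a sketch under assumptions: the sample XML is a trimmed copy of the one in the question, the XMLin options (ForceArray => [qw(level study)], KeyAttr => []) are chosen so that every <level> and <study> is an array of hashes, and the goal is taken to be collecting all <study> elements regardless of depth.

```perl
use strict;
use warnings;
use XML::Simple;

# Trimmed copy of the XML from the question (assumption: same shape, fewer nodes).
my $xml = <<'XML';
<config>
  <levels name="a" target="29.6">
    <level name="b" weight="50" target="35.1">
      <study id="32" name="i" weight="30" />
      <study id="36" name="j" weight="70" />
    </level>
    <study id="37" name="k" weight="15" />
    <level name="c" weight="13" target="22.1">
      <level name="g" weight="30.8">
        <study id="50" name="p" weight="75" />
      </level>
    </level>
  </levels>
  <study id="62" name="v" weight="0" />
</config>
XML

my $config = XMLin($xml, ForceArray => [qw(level study)], KeyAttr => []);

my @studies;
my @todo = ($config);    # start at the root, which holds the top-level studies

# Take one node off the queue, collect its studies, queue its child levels.
while (my $node = shift @todo) {
    push @studies, @{ $node->{study} || [] };
    push @todo, $node->{levels} if $node->{levels};   # the single <levels> wrapper
    push @todo, @{ $node->{level} || [] };
}

print "$_->{id} $_->{name}\n" for @studies;
```

    Because nodes are taken from the front of the queue, the studies come out level by level (breadth-first) rather than in document order.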