
Parsing xml using libXML

by shaq (Initiate)
on Jul 12, 2013 at 12:36 UTC (#1043959=perlquestion)
shaq has asked for the wisdom of the Perl Monks concerning the following question:

I need to parse an XML file using Perl. Basically, I want to convert this XML file into a data structure such as hashes and arrays. With the code below I can access all the nodes except for the graphics node and its elements.

XML file:

<pathway name="path:ko00010" org="ko" number="00010"
         title="Glycolysis / Gluconeogenesis"
         image="http://www.kegg.jp/kegg/pathway/ko/ko00010.png"
         link="http://www.kegg.jp/kegg-bin/show_pathway?ko00010">
    <entry id="13" name="ko:K01623 ko:K01624 ko:K01622 ko:K16306"
           link="http://www.kegg.jp/dbgetbin/www_bgetK01623+K16306">
        <graphics name="K01623..." fgcolor="#000000" bgcolor="#BFBFFF"
                  type="rectangle" x="483" y="404" width="46" height="17"/>
    </entry>
</pathway>

here is the perl code :

use XML::LibXML;
use strict;
use warnings;

my $parser = XML::LibXML->new;
my $xmlp   = $parser->parse_file("ko00010.xml");
my $rootel = $xmlp->getDocumentElement();
my $elname = $rootel->getName();

my @rootelements = $rootel->getAttributes();
foreach my $rootatt (@rootelements) {
    my $name  = $rootatt->getName();
    my $value = $rootatt->getValue();
    print " ${name}[$value]\n ";
}

my @kids = $rootel->childNodes();
foreach my $child (@kids) {
    my $elname = $child->getName();
    my @atts   = $child->getAttributes();
    foreach my $at (@atts) {
        my $name  = $at->getName();
        my $value = $at->getValue();
        print " ${name}[$value]\n ";
    }
}

Re: Parsing xml using libXML
by hippo (Deacon) on Jul 12, 2013 at 13:05 UTC

    If you really want to roll your own like this, it would seem sensible to use recursion. You are not doing so and are in fact only going to an explicit depth in the tree at any point, hence missing the "graphics" element (and its children if there were any).

      well I don't know how to proceed

        And why might that be? Here are some possibilities:

        1. You don't know what recursion is.
        2. You do know what recursion is, but have no idea how to do it in perl.
        3. You know what it is and how to do it, but cannot see how it applies to your problem.
        4. You've decided it would be better not to roll your own but don't know how to look for appropriate existing code.
        5. You've found some existing code but don't know how to implement it in your script.

        If you pick a number, that would give us a better idea of why you are stuck and hence how you can be assisted.
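For readers stuck at option 2 or 3, here is a minimal sketch of the recursive approach suggested above. It is one possible way to do it, not the only one: `node_to_hash` is a hypothetical helper name, the `name`/`attrs`/`children` layout is an assumption about what a useful structure might look like, and the sample XML is trimmed from the question and parsed from a string rather than from ko00010.xml.

```perl
use strict;
use warnings;
use XML::LibXML;

# Sample document, trimmed from the question.
my $xml = <<'XML';
<pathway name="path:ko00010" org="ko" number="00010">
  <entry id="13" name="ko:K01623">
    <graphics name="K01623..." type="rectangle" x="483" y="404" width="46" height="17"/>
  </entry>
</pathway>
XML

# Hypothetical helper: turn one element into a hash and recurse into
# its child elements, so the depth of the tree does not matter.
sub node_to_hash {
    my ($node) = @_;

    my %attrs = map { ($_->nodeName => $_->getValue) } $node->attributes;
    my %h = (
        name     => $node->nodeName,
        attrs    => \%attrs,
        children => [],
    );

    for my $child ($node->childNodes) {
        # Skip whitespace text nodes, comments, etc.
        next unless $child->nodeType == XML_ELEMENT_NODE;
        push @{ $h{children} }, node_to_hash($child);
    }
    return \%h;
}

my $doc  = XML::LibXML->load_xml(string => $xml);
my $tree = node_to_hash($doc->documentElement);

# pathway -> entry -> graphics -> x attribute
print $tree->{children}[0]{children}[0]{attrs}{x}, "\n";   # prints 483
```

The original loop stops at the children of the root, which is why `graphics` never shows up; the recursive call is what reaches it.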

Re: Parsing xml using libXML
by Anonymous Monk on Jul 12, 2013 at 14:18 UTC

    What's the question: how to build some Perl data structure? What kind are you trying to build?

      I need to store this XML data in data structures, such as hashes and arrays, for further use. So far I have access to all elements except for the graphics nodes and their children.

        I need to store this XML data in data structures, such as hashes and arrays, for further use. So far I have access to all elements except for the graphics nodes and their children.

        So what data structure? You've posted sample data, now post the corresponding data structure
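Since the target structure was never posted, the following is only a guess at what it might be: a hash of the pathway attributes plus an `entries` hash keyed by entry `id`, each entry carrying a list of its `graphics` attribute hashes. The layout, the `entries` key, and the helper name `attrs_of` are all assumptions made for illustration; the sample XML is trimmed from the question.

```perl
use strict;
use warnings;
use XML::LibXML;

# Sample document, trimmed from the question.
my $xml = <<'XML';
<pathway name="path:ko00010" org="ko" number="00010">
  <entry id="13" name="ko:K01623">
    <graphics name="K01623..." type="rectangle" x="483" y="404" width="46" height="17"/>
  </entry>
</pathway>
XML

# Hypothetical helper: all attributes of an element as a hash reference.
sub attrs_of {
    my ($node) = @_;
    my %attrs = map { ($_->nodeName => $_->getValue) } $node->attributes;
    return \%attrs;
}

my $doc  = XML::LibXML->load_xml(string => $xml);
my $root = $doc->documentElement;

# Assumed layout: pathway attributes at the top level, entries keyed by id.
my %pathway = ( %{ attrs_of($root) }, entries => {} );

for my $entry ($root->findnodes('./entry')) {
    my $e = attrs_of($entry);
    # An entry could hold several graphics elements, so keep a list.
    $e->{graphics} = [ map { attrs_of($_) } $entry->findnodes('./graphics') ];
    $pathway{entries}{ $e->{id} } = $e;
}

print $pathway{entries}{13}{graphics}[0]{width}, "\n";   # prints 46
```

Writing the desired structure down first, as asked above, makes it much easier to decide between a generic walker and targeted XPath queries like these.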

Re: Parsing xml using libXML
by runrig (Abbot) on Jul 12, 2013 at 16:44 UTC
    Something like XML::Rules would build your data structure fairly easily the way you want it, but since I don't know what you want, I don't know what to write. You can start with XML::Simple and decide if that's good enough, or what you don't like, and then switch to XML::Rules.

      I need to store this XML data in data structures, such as hashes and arrays, for further use. So far I have access to all elements except for the graphics nodes and their children.
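As a rough starting point along the lines of the XML::Simple suggestion above, this sketch reads the sample into nested hashes and arrays with `XMLin`. The option choices are assumptions about what is wanted here: `KeyAttr => []` switches off the default folding on `name`/`id`/`key`, and `ForceArray` keeps `entry` and `graphics` as arrays even when there is only one of each. Note that XML::Simple's own documentation discourages its use in new code, so treat this as a quick way to see a structure rather than a recommendation.

```perl
use strict;
use warnings;
use XML::Simple qw(XMLin);

# Sample document, trimmed from the question.
my $xml = <<'XML';
<pathway name="path:ko00010" org="ko" number="00010">
  <entry id="13" name="ko:K01623">
    <graphics name="K01623..." bgcolor="#BFBFFF" type="rectangle" x="483" y="404"/>
  </entry>
</pathway>
XML

my $data = XMLin(
    $xml,
    KeyAttr    => [],                      # no folding on name/id/key
    ForceArray => [ 'entry', 'graphics' ], # always arrays, even for one element
);

# Root attributes become top-level keys; nested elements become arrays of hashes.
print $data->{entry}[0]{graphics}[0]{bgcolor}, "\n";   # prints #BFBFFF
```

If the resulting shape is not good enough, that is the point at which switching to XML::Rules, as suggested, would make sense.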

Re: Parsing xml using libXML
by zork42 (Monk) on Jul 14, 2013 at 14:18 UTC
    The OP includes this text:
    here is the perl code :
    use XML::LibXML;
    But unfortunately line 1 in the OP is actually ~335 space characters.

    This is messing up the page formatting both within this topic, and on the SoPW page on which this topic appears.

    Please could someone with god-like powers, or shaq, replace the long line 1 with a short line 1 (so the line numbering remains the same)?

    Thank you!
        Hey Anonymous Monk thanks for that very helpful post :)

        I've now ticked "Auto Code Wrapping" and everything looks fine now :)

        Should I still /msg a janitor to get this fixed for other people?
        (shaq is maybe too new to fix it)
