PerlMonks  

XML Parser and certain nodes to a hash

by Dr Manhattan (Beadle)
on May 23, 2013 at 09:06 UTC (#1034911=perlquestion)
Dr Manhattan has asked for the wisdom of the Perl Monks concerning the following question:

Hi all

I am trying to parse a multi-level XML document and then add certain nodes to a hash. With so many levels, I am struggling to reference the nodes that I want to add.

This is the code I have

#!/usr/bin/perl
use XML::Simple;
use Data::Dumper;
use utf8;

open (INPUT, "<:utf8", "w.xml") or die "Can't open";

# KeyAttr => [] disables XML::Simple's folding of elements on name/key/id attributes.
my $xml = XML::Simple->new(KeyAttr => []);
my $W = $xml->XMLin("wnafr.xml");

open (OUTPUT, ">:utf8", "Output.txt") or die "Can't open";
print OUTPUT Dumper($W);

foreach my $e (@{$W->{XML}}) {
    print "I.D.:", $e->{ID}, "\n";
    print "Part of Speech: ", $e->{POS}, "\n";
    # Only valid when there is a single LITERAL; with several, LITERAL is an array ref.
    print "Literal: ", $e->{SYNONYM}{LITERAL}{content}, "\n";
    print "\n";
}

close (OUTPUT);

The XML looks like this

<XML><ID>ENG20-00001740-a</ID><POS>a</POS><SYNONYM><LITERAL sense="1">in staat</LITERAL><WORD>in</WORD><WORD>staat</WORD></SYNONYM><ILR type="be_in_state">ENG20-05295659-n</ILR><ILR type="be_in_state">ENG20-04904666-n</ILR><ILR type="near_antonym">ENG20-00002062-a</ILR><DEF/><USAGE/><BCS>3</BCS><DOMAIN>quality</DOMAIN><SUMO type="=">Breathing</SUMO></XML>

<XML><ID>ENG20-00001740-v</ID><POS>v</POS><SYNONYM><LITERAL sense="1">asem</LITERAL><WORD>asem</WORD><LITERAL sense="1">respireer</LITERAL><WORD>respireer</WORD><LITERAL sense="1">asem skep</LITERAL><WORD>asem</WORD><WORD>skep</WORD><LITERAL sense="1">asemhaal</LITERAL><WORD>asemhaal</WORD></SYNONYM><ILR type="verb_group">ENG20-00002536-v</ILR><ILR type="verb_group">ENG20-00002307-v</ILR><ILR type="also_see">ENG20-00004923-v</ILR><ILR type="also_see">ENG20-00004127-v</ILR><ILR type="subevent">ENG20-00004923-v</ILR><ILR type="subevent">ENG20-00004127-v</ILR><DEF/><USAGE/><BCS>3</BCS><DOMAIN>medicine</DOMAIN><SUMO type="=">Breathing</SUMO></XML>

I want to put all the Literal->content values in a separate hash, but the problem is that there can be more than one. If there is more than one LITERAL, the XML parser stores them as an array of hashes. If there is only one LITERAL, it is simply stored as a hash.

Any ideas?
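
One common way to cope with the hash-versus-array difference described above is to check ref() and normalise to a list before collecting the content values. The sketch below is only an illustration: the @entries data is a hand-written stand-in for what XML::Simple might return for the two sample entries (not real parser output), and literals_of and %literals are hypothetical names. XML::Simple also documents a ForceArray option (for example ForceArray => ['LITERAL']) that makes the named element always come back as an array ref, which would remove the need for the check.

```perl
use strict;
use warnings;

# Assumed stand-in data: one LITERAL gives a hash ref, several give an array ref.
my @entries = (
    {
        ID      => 'ENG20-00001740-a',
        SYNONYM => { LITERAL => { sense => 1, content => 'in staat' } },
    },
    {
        ID      => 'ENG20-00001740-v',
        SYNONYM => {
            LITERAL => [
                { sense => 1, content => 'asem' },
                { sense => 1, content => 'respireer' },
                { sense => 1, content => 'asem skep' },
                { sense => 1, content => 'asemhaal' },
            ],
        },
    },
);

# Hypothetical helper: always return a (possibly empty) list of LITERAL hashes.
sub literals_of {
    my ($entry) = @_;
    my $lit = $entry->{SYNONYM}{LITERAL};
    return ()    unless defined $lit;
    return @$lit if ref $lit eq 'ARRAY';
    return ($lit);
}

# Map each entry ID to an array ref holding its literal strings.
my %literals;
for my $e (@entries) {
    $literals{ $e->{ID} } = [ map { $_->{content} } literals_of($e) ];
}

for my $id (sort keys %literals) {
    print "$id: ", join(', ', @{ $literals{$id} }), "\n";
}
```

Keying the hash by ID is just one possible layout; if only the set of literals matters, the content strings themselves could be used as the hash keys instead.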

Re: XML Parser and certain nodes to a hash
by Anonymous Monk on May 23, 2013 at 09:11 UTC
Re: XML Parser and certain nodes to a hash
by hdb (Prior) on May 23, 2013 at 09:23 UTC
Re: XML Parser and certain nodes to a hash
by marto (Bishop) on May 23, 2013 at 09:26 UTC
    open (INPUT, "<:utf8", "w.xml") or die "Can't open";

    You're not printing $! when attempting to open files. If you include it, you'll find out why things fail:

    open (INPUT, "<:utf8", "w.xml") or die "Can't open w.xml: $!";
Re: XML Parser and certain nodes to a hash
by Jenda (Abbot) on May 24, 2013 at 15:15 UTC

    "parse a multi-level XML document, and then adding certain nodes to a hash" ... this is a textbook case for XML::Rules.

    Jenda
    Enoch was right!
    Enjoy the last years of Rome.
