XML Parser and certain nodes to a hash

by Dr Manhattan (Beadle)
on May 23, 2013 at 09:06 UTC (#1034911=perlquestion)
Dr Manhattan has asked for the wisdom of the Perl Monks concerning the following question:

Hi all

I am trying to parse a multi-level XML document and then add certain nodes to a hash. The problem is that, with the many different levels, I am struggling to reference the nodes that I want to add.

This is the code I have

#!/usr/bin/perl
use XML::Simple;
use Data::Dumper;
use utf8;

open (INPUT, "<:utf8", "w.xml") or die "Can't open";

# KeyAttr=>[] turns off folding of nested elements into hashes keyed on attributes
my $xml = new XML::Simple (KeyAttr=>[]);
my $W = $xml->XMLin("wnafr.xml");

# Dump the whole parsed structure to a file for inspection
open (OUTPUT, ">:utf8", "Output.txt") or die "Can't open";
print OUTPUT Dumper($W);

foreach my $e (@{$W->{XML}}) {
    print "I.D.:", $e->{ID}, "\n";
    print "Part of Speech: ", $e->{POS}, "\n";
    print "Literal: ", "$e->{SYNONYM}{LITERAL}{content}", "\n";
    print "\n";
}
close (OUTPUT);

The XML looks like this

<XML><ID>ENG20-00001740-a</ID><POS>a</POS><SYNONYM><LITERAL sense="1">in staat</LITERAL><WORD>in</WORD><WORD>staat</WORD></SYNONYM><ILR type="be_in_state">ENG20-05295659-n</ILR><ILR type="be_in_state">ENG20-04904666-n</ILR><ILR type="near_antonym">ENG20-00002062-a</ILR><DEF/><USAGE/><BCS>3</BCS><DOMAIN>quality</DOMAIN><SUMO type="=">Breathing</SUMO></XML>

<XML><ID>ENG20-00001740-v</ID><POS>v</POS><SYNONYM><LITERAL sense="1">asem</LITERAL><WORD>asem</WORD><LITERAL sense="1">respireer</LITERAL><WORD>respireer</WORD><LITERAL sense="1">asem skep</LITERAL><WORD>asem</WORD><WORD>skep</WORD><LITERAL sense="1">asemhaal</LITERAL><WORD>asemhaal</WORD></SYNONYM><ILR type="verb_group">ENG20-00002536-v</ILR><ILR type="verb_group">ENG20-00002307-v</ILR><ILR type="also_see">ENG20-00004923-v</ILR><ILR type="also_see">ENG20-00004127-v</ILR><ILR type="subevent">ENG20-00004923-v</ILR><ILR type="subevent">ENG20-00004127-v</ILR><DEF/><USAGE/><BCS>3</BCS><DOMAIN>medicine</DOMAIN><SUMO type="=">Breathing</SUMO></XML>

I want to put all the Literal->content values in a separate hash, but the problem is that there can be more than one. If there is more than one Literal->content, the XML parser stores them as an array of hashes. If there is only one Literal->content, it is simply stored as a hash.

Any ideas?
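
One common way to cope with the hash-or-array situation described above is to check ref() and normalise both shapes to a list before collecting the values. The following is only a minimal sketch: literal_contents and %literals are hypothetical names, and the @entries data is hand-written to mimic the shape XMLin would presumably return for the two records shown, so that the example runs without the XML file.

```perl
use strict;
use warnings;

# Hypothetical helper: return the text of every LITERAL in one entry,
# whether the parser produced a single hash (one <LITERAL>) or an
# array of hashes (several <LITERAL> elements).
sub literal_contents {
    my ($entry) = @_;
    my $lit = $entry->{SYNONYM}{LITERAL};
    return () unless defined $lit;

    # Normalise: a lone hash becomes a one-element list
    my @items = ref($lit) eq 'ARRAY' ? @$lit : ($lit);

    # An element without attributes may be a plain string, not a hash
    return map { ref($_) eq 'HASH' ? $_->{content} : $_ } @items;
}

# Assumed data, shaped like the parsed output for the two sample records
my @entries = (
    {   ID      => 'ENG20-00001740-a',
        SYNONYM => { LITERAL => { sense => 1, content => 'in staat' } },
    },
    {   ID      => 'ENG20-00001740-v',
        SYNONYM => {
            LITERAL => [
                { sense => 1, content => 'asem' },
                { sense => 1, content => 'respireer' },
                { sense => 1, content => 'asem skep' },
                { sense => 1, content => 'asemhaal' },
            ],
        },
    },
);

# Separate hash: entry ID => array ref of literal strings
my %literals;
for my $e (@entries) {
    $literals{ $e->{ID} } = [ literal_contents($e) ];
}

for my $id (sort keys %literals) {
    print "$id: ", join(', ', @{ $literals{$id} }), "\n";
}
```

An alternative that avoids the ref() check is to pass ForceArray => ['LITERAL'] to the XML::Simple constructor, which should make LITERAL an array reference even when only one is present; the loop can then always dereference it as an array.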

Re: XML Parser and certain nodes to a hash
by Anonymous Monk on May 23, 2013 at 09:11 UTC
Re: XML Parser and certain nodes to a hash
by hdb (Parson) on May 23, 2013 at 09:23 UTC
Re: XML Parser and certain nodes to a hash
by marto (Chancellor) on May 23, 2013 at 09:26 UTC
    open (INPUT, "<:utf8", "w.xml") or die "Can't open";

    You're not printing $! when attempting to open files. If you include it, you'll find out why things fail:

    open (INPUT, "<:utf8", "w.xml") or die "Can't open w.xml: $!";
Re: XML Parser and certain nodes to a hash
by Jenda (Abbot) on May 24, 2013 at 15:15 UTC

    "parse a multi-level XML document, and then adding certain nodes to a hash" ... this is a textbook case for XML::Rules.

    Jenda
    Enoch was right!
    Enjoy the last years of Rome.

Node Type: perlquestion [id://1034911]
Approved by Corion