PerlMonks  

XML Parsing

by orthanc (Monk)
on Apr 12, 2000 at 15:12 UTC ( #7372=perlquestion )
orthanc has asked for the wisdom of the Perl Monks concerning the following question:

Fellow Perl Hackers,
I want to parse an XML file, pick out the data so I can verify it, and then put the data back into the XML.
I've tried XML::Simple's XMLin, which returns a nice hash, but I need a way to traverse it while keeping the same hierarchy.
It would be nice to keep using this hash, since I can then use XMLout to regenerate the XML.

Is this a good method or is XML::Parser better?

Any help appreciated

Re: XML Parsing
by chromatic (Archbishop) on Apr 12, 2000 at 19:04 UTC
    If you want to keep the XML in a specific order, you can use XML::Parser and treat the data as a stream. It looks like you pass your XML::Parser some callback functions to activate when it hits the start of a tag, the end of a tag, and non-markup text.

    Most of the stuff I'd keep in XML doesn't need to maintain a strictly hierarchical tree, so the simplest (and smallest) parser is best for my purposes. If you can use a tied hash with XML::Simple::XMLin, that is another option. (A tied hash such as Tie::IxHash acts like a hash, but it keeps its keys in insertion order.)

Re: XML Parsing
by btrott (Parson) on Apr 12, 2000 at 19:46 UTC
    XML::Parser is definitely the more powerful of the two options, but it's more complex to use. One way to use it is to process the XML as a stream; you give the parser some callback functions to use as handlers, and those functions get called each time a specific "event" occurs in the parser: the start of a tag, the end of a tag, character (non-markup) data. XML::Parser also has "styles" that make it easier to use (like a "Tree" style that loads your data into a tree--although don't expect a nice hash like XML::Simple gives you).
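    As a rough sketch of the handler approach described above (the sample XML and the @events list are made up for illustration, and XML::Parser has to be installed from CPAN), the three callbacks might be wired up like this:

```perl
use strict;
use warnings;
use XML::Parser;   # CPAN module, not part of core Perl

# Hypothetical sample document, invented for this sketch.
my $xml = '<order id="42"><item>apple</item><item>pear</item></order>';

my @events;   # events recorded in the order the parser reports them

my $parser = XML::Parser->new(
    Handlers => {
        # Called at each opening tag: (expat object, element name, attribute pairs)
        Start => sub {
            my ($expat, $element, %attrs) = @_;
            push @events, "start $element";
        },
        # Called at each closing tag
        End => sub {
            my ($expat, $element) = @_;
            push @events, "end $element";
        },
        # Called for non-markup text; a single run of text may arrive in pieces
        Char => sub {
            my ($expat, $text) = @_;
            push @events, "text $text" if $text =~ /\S/;
        },
    },
);

$parser->parse($xml);
print "$_\n" for @events;
```

    Because the events arrive in document order, anything you build from them keeps the order of the original file.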

    XML::Simple returns a hash reference, so couldn't you write a recursive routine that descends through that hash and does whatever validation you want on it?

    There are a bunch of XML modules, and here's a really good article about them: Processing XML with Perl.

    From what you said, you may be best off just going with XML::Simple and writing a sub to recurse through the hash.
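    A minimal sketch of such a recursive sub follows. The $data structure is hand-built to resemble what XMLin might return (the real shape depends on your XML and on the options you pass), and the walk sub and the "no empty values" check are only placeholders for your own validation:

```perl
use strict;
use warnings;

# Hand-built stand-in for a nested structure of the kind XMLin returns.
# Hypothetical data; a real structure depends on the XML and options used.
my $data = {
    name  => 'widget',
    price => '9.99',
    parts => {
        part => [
            { id => '1', label => 'bolt' },
            { id => '2', label => '' },
        ],
    },
};

# Recursively visit every leaf, calling $check with its path and value.
sub walk {
    my ($node, $path, $check) = @_;
    if (ref $node eq 'HASH') {
        # Hashes have no inherent order; sort the keys for a repeatable walk.
        walk($node->{$_}, "$path/$_", $check) for sort keys %$node;
    }
    elsif (ref $node eq 'ARRAY') {
        walk($node->[$_], $path . "[$_]", $check) for 0 .. $#$node;
    }
    else {
        $check->($path, $node);
    }
}

# Placeholder validation: collect the path of every empty or undefined leaf.
my @problems;
walk($data, '', sub {
    my ($path, $value) = @_;
    push @problems, $path unless defined $value && length $value;
});

print "empty value at $_\n" for @problems;   # prints: empty value at /parts/part[1]/label
```

    The walk only reads the structure, so the same hash reference can still be handed to XMLout afterwards.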

Re: XML Parsing
by mikfire (Deacon) on Apr 12, 2000 at 21:30 UTC
    If I have read the question correctly, order is important to you - i.e., you need to preserve the order in which the tags are read.

    According to the documentation I have read, you are right. XML::Simple will not do what you want but XML::Parser will. Further, it appears XML::Parser can be used in Table mode and will return an array much like what you appear to be asking for - saving you the pain of dealing with the Stream mode directly.

    I found a good series of articles at Perl Month and, more to your problem, this article.

    Mik
    Mik Firestone ( perlus bigotus maximus )

      I think you mean Tree mode, not Table mode. (There's also an "Objects" mode that returns a tree of Perl objects whose class names are based on the element names.)

      The only bad thing about Tree mode is that the resultant data structure is rather difficult to understand, in my opinion. It's certainly not going to give you a nice hash to play around with. :) But if you feel like figuring out how the tree is structured, that may be your best option.
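      To give an idea of the layout, here is a sketch that walks the structure the XML::Parser documentation gives as its Tree style example. The tree is written out by hand so the code runs without the module, and the outline sub is just an illustrative helper, not part of XML::Parser:

```perl
use strict;
use warnings;

# The structure XML::Parser documents for its Tree style, for the input
# <foo><head id="a">Hello <em>there</em></head><bar>Howdy<ref/></bar>do</foo>
# Written out by hand here so the sketch runs without the module.
my $tree = [
    'foo', [
        {},
        'head', [ { id => 'a' }, 0, 'Hello ', 'em', [ {}, 0, 'there' ] ],
        'bar',  [ {}, 0, 'Howdy', 'ref', [ {} ] ],
        0, 'do',
    ],
];

# Each element is a (tag, content) pair. Content is an array ref whose first
# item is the attribute hash, followed by further (tag, content) pairs.
# The pseudo-tag 0 marks a text node, whose "content" is a plain string.
sub outline {
    my ($tag, $content, $depth, $out) = @_;
    my ($attrs, @kids) = @$content;
    my $attr_text = join '', map { " $_=\"$attrs->{$_}\"" } sort keys %$attrs;
    push @$out, ('  ' x $depth) . "<$tag$attr_text>";
    while (@kids) {
        my ($kid_tag, $kid_content) = splice @kids, 0, 2;
        if ($kid_tag eq '0') {
            push @$out, ('  ' x ($depth + 1)) . "text: $kid_content";
        }
        else {
            outline($kid_tag, $kid_content, $depth + 1, $out);
        }
    }
}

my @lines;
outline(@$tree, 0, \@lines);
print "$_\n" for @lines;
```

      Children stay in document order because they live in arrays rather than hashes, which is what makes this style awkward to read but useful when order matters.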

      There are some other modules that use tree representations: XML::Grove, XML::Twig, and XML::DOM. These are all discussed in that article I posted before.

        Dear Friends, I have an XML file that's encoded in ISO-8859-1. I have some European characters coming in from 2 fields (Name, Comments) in the XML file. Can anyone suggest if there are any functions in Perl to read those characters? Using Perl can I parse this xml file? Please suggest. I have not done this before. Regards, Madhavi.
