It might be difficult but I'll try anyway ;--).
At least here are a few hints:
- document size: big documents exclude most tree-oriented modules,
such as XML::Simple, XML::DOM and XML::XPath;
"big" depends on your RAM and on the expansion factor of the module
(in-memory size relative to file size), typically between 7 and 10,
- type of XML: document-oriented XML excludes modules such as XML::Simple
and XML::SimpleObjects, because those modules don't deal with mixed content
(<p>this is <b>mixed</b> content</p>),
- ease of use: although this is highly subjective, XML::Simple seems to be
considered really easy to use, as it completely masks the XML by loading it into
a Perl data structure (a pretty convoluted data structure IMHO, use Data::Dumper!);
tree-based modules (XML::XPath, XML::DOM, XML::Twig)
are generally easier to use than stream-based ones, although for simple data
extraction XML::PYX is very convenient,
- speed: at the moment XML::Parser is the fastest (all other modules
are based on it), but modules based on libXML should be faster soon (XML::XPath
2.0 for example). Stream-based modules are usually faster than tree-based ones.
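To make the "big depends on your RAM" hint a bit more concrete, here is a minimal sketch of the arithmetic, assuming the 7 to 10 expansion factor mentioned above. The function names (estimated_tree_size, fits_in_ram) are made up for illustration and are not part of any XML module.

```perl
use strict;
use warnings;

# Hypothetical helper: rough in-memory size range (in bytes) of a document
# loaded by a tree-based module, using the 7x to 10x expansion factor
# quoted above as default bounds.
sub estimated_tree_size {
    my ($file_bytes, $low, $high) = @_;
    $low  = 7  unless defined $low;
    $high = 10 unless defined $high;
    return ($file_bytes * $low, $file_bytes * $high);
}

# Hypothetical helper: true if the worst-case estimate fits in the given RAM.
sub fits_in_ram {
    my ($file_bytes, $ram_bytes) = @_;
    my (undef, $worst) = estimated_tree_size($file_bytes);
    return $worst <= $ram_bytes ? 1 : 0;
}

my $mb = 1024 * 1024;
my ($low, $high) = estimated_tree_size(50 * $mb);
printf "A 50 MB file may need roughly %.0f to %.0f MB as a tree\n",
    $low / $mb, $high / $mb;
# prints: A 50 MB file may need roughly 350 to 500 MB as a tree
```

If the worst case does not fit comfortably, that is the sign to look at a stream-based module (or XML::Twig) rather than a pure tree-based one.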
You can also have a look at the Module Reviews for XML modules and Ways to Rome, an article that solves the same problem using various XML modules.
The problem is that there is a lot of overlap between the various modules. Some cannot be used in certain circumstances, but for any particular problem there are at least two modules that will work. Basically it boils down to how much you like the interface of each module.
A quick overview would be:
- XML::Parser: the basis, most of the other modules are built on top of it, fast, low-level (can be a pain to use),
- XML::Simple: quite simple, robust, widely-used, tree-based (hence can be slow on big files and cannot deal with huge ones), does not work for document-oriented XML,
- XML::DOM: ugly, tree-oriented, widely used, not actively maintained at the moment, follows a W3C standard, can be a pain to install (BTW, if you are interested in the DOM I have started writing a little helper module for it, named... XML::DOM::Twig),
- XML::PYX: line-oriented, fast, not convenient for complex transformations,
- XML::XPath: powerful, getting faster and faster, very well supported (by Matt Sergeant, the most prolific XML developer around),
- XML::Twig: Perlish, DWIMy, can deal with huge documents, you know what I think of it ;--)
There are others too: XML::RAX for record-oriented XML, XML::Dt, XML::SimpleObjects...
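To give an idea of what "low-level" means for XML::Parser, here is a small sketch in stream style that pulls the text of every <b> element out of a string, using the mixed-content example from earlier in the thread. It assumes XML::Parser is installed; bold_text is a name made up for this example, and the code only shows the handler mechanism, it is not a recommended design.

```perl
use strict;
use warnings;
use XML::Parser;

# Hypothetical helper: return the text of every top-level <b> element,
# collected with XML::Parser's Start/Char/End handlers.
sub bold_text {
    my ($xml) = @_;
    my @found;          # text of each completed <b> element
    my $depth  = 0;     # greater than 0 while inside a <b> element
    my $buffer = '';    # text accumulated for the current <b>

    my $parser = XML::Parser->new(
        Handlers => {
            Start => sub {
                my ($expat, $element, %atts) = @_;
                if ($element eq 'b') {
                    $buffer = '' if $depth == 0;
                    $depth++;
                }
            },
            # Char may be called several times for one run of text,
            # so the pieces are appended rather than assigned.
            Char => sub {
                my ($expat, $string) = @_;
                $buffer .= $string if $depth;
            },
            End => sub {
                my ($expat, $element) = @_;
                if ($element eq 'b') {
                    $depth--;
                    push @found, $buffer if $depth == 0;
                }
            },
        },
    );
    $parser->parse($xml);
    return @found;
}

print join(', ', bold_text('<p>this is <b>mixed</b> content</p>')), "\n";
# prints: mixed
```

All the state (depth, buffer) has to be tracked by hand, which is exactly the bookkeeping the higher-level modules do for you. On the other hand nothing is kept in memory beyond what the handlers store, and mixed content is not a problem.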
In any case I think we're heading towards big changes in the XML module landscape. XML::Parser is not a SAX-based parser (it actually predates SAX), which is a pain, and it is also quite a pain to install (it is based on expat, an external library). I think we will see new modules based either on a pure Perl SAX parser (there is one in SOAP::Lite) or on libXML, the Gnome XML library, plus existing modules being ported to interface with those 2 kinds of SAX parsers.
So I guess it will always be very difficult to give a "decision-tree" to choose a module, and in any case it is too early...
All you seek is here and here
Both are articles at ORA xml.com site. They are excellent!
mitd-Made in the Dark
'My favourite colour appears to be grey.'