Order Perl & XML

Item Description: (see title)

Review Synopsis: A Road Map to Processing XML with Perl

Perl & XML is a hard book to categorize - it is not a beginner's book and it is not a cookbook. Instead, I found it to be a nice road map to the many XML processing modules on CPAN available to Perl programmers. (I also found it to be a nice departure from the many XML books that cater only to Java.) This is YAGOB (Yet Another Good O'Reilly Book): nice typesetting, thorough explanations, and a quirky animal cover. The index is decent, but I was a bit disheartened to find that XML::Twig was not included in it. It is, however, covered in chapter 8.

The first chapter is the obligatory introduction; it introduces XML::Simple and discusses 'XML Gotchas'. Chapter two provides a very nice overview of XML in general. It gives XML newbies the necessary base and serves as a decent reference while working through the rest of the book. It also gives an example of an XSLT transformation - converting an XML document to an XHTML document without the help of Perl.

The fun starts with chapter three, where actual XML processing is discussed and demonstrated. The CPAN modules XML::Parser, XML::LibXML, XML::XPath, and XML::Writer are given brief introductions with sample code. Also included is a demonstration of the wrong way to write an XML parser (a well-formedness checker by hand) and the right way (by using XML::Parser). Document validation and DTDs are introduced, with XML::LibXML code as a demonstration, and finally, Unicode encodings are compared and contrasted.

Chapters four and six cover event-based and tree-based parsing respectively. Chapter four goes into more detail with XML::Parser and discusses 'repackaging' XML as PYX via XML::PYX. Chapter six discusses XML::Parser yet again (along with XML::Simple) and introduces XML::SimpleObject, XML::TreeBuilder, and XML::Grove. Each module covered is given a good overview and sample code to demonstrate it.

Chapters five and seven cover the SAX and DOM modules respectively. (I recommend reading chapters four and six before covering five and seven.) Chapter five covers an example of converting Excel spreadsheets to XML via XML::SAXDriver::Excel, as well as SAX2 and installing your own XML::SAX parsers via the h2xs utility. The majority of chapter seven is a DOM class interface reference. There are two examples in this chapter: one that processes an XHTML document with XML::DOM and one that works with DOM2 and namespaces via XML::LibXML.

Chapter eight discusses how to make tree-based parsing faster and more efficient via a hand-rolled DOM iterator module (named XML::DOMIterator) that is used in conjunction with XML::DOM, and also revisits the 'node hunter' module XML::XPath. Also included is mirod's XML::Twig, which is used in three examples, one of which shows how tree-based parsing can be optimized by only parsing the smallest part of the tree that needs to be parsed. XSLT is also given a more thorough discussion than the overview given in chapter two, including how it can be used in conjunction with Perl via XML::LibXSLT.

Chapters nine and ten round the book off with application examples. Chapter nine covers RSS with XML::RSS and briefly discusses XML::Generator::DBI (but makes no mention of DBIx::XML_RDB - see mirod's comment below). It also briefly discusses the controversial SOAP::Lite. Chapter ten provides an application that subclasses an XML parser to provide an API via CGI for manipulating an XML document. Also included is a mod_perl application for converting DocBook files into HTML on the fly, as well as a discussion, solution, and work-around involving the pitfalls of using the Expat library in mod_perl.

The only cons I found were a few typos dealing with the ampersand character. Sometimes you will find &amp; when the authors meant & and vice versa. Any seasoned Perl programmer will immediately spot these typos, but some beginners might not. Another con is that the authors discuss XML::Writer, but fail to use it in many examples that write XML. They instead write it by hand, which contradicts the point of using CPAN modules in the first place. Again, a seasoned Perl programmer will know better. The last con is a minor nit-pick: a lot of the code seemed somewhat Java-like to me. However, these cons weigh considerably less than the pros. Again, I recommend this book to any seasoned Perl programmer who has not yet entered the realm of XML processing.
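To make the ampersand and write-it-by-hand points concrete, here is a minimal sketch of my own (it is not taken from the book). It uses only core Perl, and escape_xml_text is a hypothetical helper name invented for this illustration. It shows how easily naive string interpolation produces malformed XML, which is the kind of mistake a module such as XML::Writer is meant to take off your hands.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the book): escape the characters that are
# special in XML text and attribute values. The ampersand must be replaced
# first, otherwise the entities produced by the later substitutions would
# themselves be escaped a second time.
sub escape_xml_text {
    my ($text) = @_;
    $text =~ s/&/&amp;/g;
    $text =~ s/</&lt;/g;
    $text =~ s/>/&gt;/g;
    $text =~ s/"/&quot;/g;
    return $text;
}

my $title = 'Perl & XML';

# Naive interpolation: the bare ampersand makes the document malformed.
my $broken = "<book><title>$title</title></book>";

# Escaped by hand: well-formed, but easy to forget somewhere.
my $fixed = '<book><title>' . escape_xml_text($title) . '</title></book>';

print "$broken\n";
print "$fixed\n";
```

The first line printed contains a bare & and would be rejected by any conforming parser; the second contains &amp; and is well-formed. Having to remember a helper like this at every output point is exactly why hand-written XML is fragile.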

Overall, I feel this is an excellent book for intermediate to advanced Perl programmers with little or no knowledge of XML processing. Tired of only knowing how to use XML::Simple? This book will show you the alternatives!

Other Reviews:

  1. mirod's use Perl Review
  2. O'Reilly Reader Reviews
  3. as well as reviews from your favorite online book vendor