Re: (RFC) XML::Rules - yet another XML parser
by trwww (Priest)
on Nov 06, 2006 at 00:08 UTC
I've done stuff like this very often.
When I see this, I see a generic state maintenance mechanism for SAX.
SAX is great. It is sometimes the only option when you have documents that are too big to fit into RAM. But because it provides nothing more than a dispatch mechanism for document particles, maintaining state between the different callbacks can get tricky, boring, and error-prone.
As an example, refer to Kip Hampton's excellent xml.com article High-Performance XML Parsing With SAX.
In it, he uses an XML document that represents a series of emails that need to be sent as the sample data. He uses SAX to build up an argument list for Mail::Sendmail. When the end_element callback is fired for the record, he has Mail::Sendmail send the email.
The relevant point here is how much code he has to write to extract the data from the individual record.
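To make the bookkeeping concrete, here is a rough sketch of the kind of handler that approach needs. It is not Hampton's code: it is written in Python's standard xml.sax so that it runs without extra modules (the three callbacks correspond to start_element, characters, and end_element in a Perl SAX handler), and the element names (mail, message, to, from, subject, body) and the send callback are assumptions made up for illustration.

```python
import xml.sax


class MailHandler(xml.sax.ContentHandler):
    """Collects the fields of each <message> by hand-tracking parser state."""

    FIELDS = {"to", "from", "subject", "body"}  # hypothetical field elements

    def __init__(self, send):
        super().__init__()
        self.send = send      # callback invoked once per completed message
        self.record = None    # dict being built, or None outside <message>
        self.field = None     # name of the field element we are inside
        self.buffer = []      # text chunks; characters() may fire repeatedly

    def startElement(self, name, attrs):
        if name == "message":
            self.record = {}
        elif self.record is not None and name in self.FIELDS:
            self.field = name
            self.buffer = []

    def characters(self, content):
        # Only keep text that belongs to a field we care about.
        if self.field is not None:
            self.buffer.append(content)

    def endElement(self, name):
        if name == self.field:
            self.record[name] = "".join(self.buffer).strip()
            self.field = None
        elif name == "message":
            self.send(self.record)  # the record is complete at its closing tag
            self.record = None


SAMPLE = """<mail>
  <message><to>a@example.com</to><from>b@example.com</from>
    <subject>Hello</subject><body>First message</body></message>
  <message><to>c@example.com</to><from>b@example.com</from>
    <subject>Again</subject><body>Second message</body></message>
</mail>"""

sent = []  # stands in for actually sending mail
xml.sax.parseString(SAMPLE.encode("utf-8"), MailHandler(sent.append))
```

Three instance variables (record, field, buffer) exist only to carry state from one callback to the next, and every new level of nesting in the document adds more of them.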
According to the POD in your module:
Or you could view it as yet another event based XML parser that differs from all the others only in two things. First that it only let's you hook your callbacks to the closing tags. And that stores the data for you so that you do not have to use globals or closures and wonder where to attach the snippet of data you just received onto the structure you are building
what you have would help quite a bit in maintaining the state of an XML record.
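As I read that description, the general idea can be sketched on top of SAX with a stack: collect each element's data as it is parsed, call a rule when the closing tag arrives, and attach whatever the rule returns to the parent. The sketch below is only my illustration of that idea in Python's standard xml.sax. It is not XML::Rules and does not use its API; the class name, the rule signature (name, text, children), and the sample element names are all assumptions.

```python
import xml.sax


class ClosingTagRules(xml.sax.ContentHandler):
    """Calls a per-element rule at each closing tag; the handler keeps the state."""

    def __init__(self, rules):
        super().__init__()
        self.rules = rules                 # element name -> callable
        self.stack = [{"_text": []}]       # one dict per open element, plus a root sentinel

    def startElement(self, name, attrs):
        self.stack.append({"_text": []})

    def characters(self, content):
        self.stack[-1]["_text"].append(content)

    def endElement(self, name):
        data = self.stack.pop()
        text = "".join(data.pop("_text")).strip()
        rule = self.rules.get(name)
        if rule is None:
            return                         # no rule: the element's data is dropped
        result = rule(name, text, data)    # data now holds only child results
        if result is not None:
            self.stack[-1][name] = result  # attach the result to the parent


sent = []  # stands in for actually sending mail


def keep_text(name, text, children):
    return text


def send_message(name, text, children):
    sent.append(children)                  # returns None, so nothing is stored upward


rules = {"to": keep_text, "subject": keep_text, "message": send_message}

SAMPLE = "<mail><message><to>a@example.com</to><subject>Hello</subject></message></mail>"
xml.sax.parseString(SAMPLE.encode("utf-8"), ClosingTagRules(rules))
```

The rules themselves hold no state at all, which is the part that looks attractive compared with a hand-written handler.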
If you don't mind, what would the rules for your module look like to perform the same task Hampton did with SAX?
Your module would be very useful implemented as a SAX handler so users could take advantage of the many features of SAX (swappable parsers, chained filters/handlers, document writers, standardized interface). Imagine the process you describe above as a web service. You could use a SAX writer in the same pipeline to build up the response for the client request.
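For anyone who has not chained SAX components before, here is a small sketch of what such a pipeline looks like: a parser feeds a filter, and the filter feeds a writer that serializes the events back to XML. It uses Python's standard xml.sax and xml.sax.saxutils rather than the Perl SAX modules, and the DropElement filter and the sample document are made up for illustration.

```python
import io
import xml.sax
from xml.sax.saxutils import XMLFilterBase, XMLGenerator


class DropElement(XMLFilterBase):
    """Filter that removes one element, and everything inside it, from the event stream."""

    def __init__(self, parent, name):
        super().__init__(parent)
        self.name = name
        self.depth = 0        # > 0 while we are inside the dropped element

    def startElement(self, name, attrs):
        if self.depth or name == self.name:
            self.depth += 1
        else:
            super().startElement(name, attrs)

    def endElement(self, name):
        if self.depth:
            self.depth -= 1
        else:
            super().endElement(name)

    def characters(self, content):
        if not self.depth:
            super().characters(content)


SAMPLE = b"<message><to>a@example.com</to><body>secret</body></message>"

out = io.StringIO()
pipeline = DropElement(xml.sax.make_parser(), "body")     # parser -> filter
pipeline.setContentHandler(XMLGenerator(out, encoding="utf-8"))  # filter -> writer
pipeline.parse(io.BytesIO(SAMPLE))
response = out.getvalue()
```

Each stage only sees SAX events, so a rules-style handler written against the same interface could sit anywhere in that chain.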
Regardless of how you proceed, I'll definitely keep an eye on it. I know I'll use it.