
Hi Jenda

I've done stuff like this very often.

When I see this, I see a generic state maintenance mechanism for SAX.

SAX is great. It is sometimes the only option when you have documents that are too big to fit into RAM. But because it provides nothing more than a dispatch mechanism for document particles, maintaining state between the different callbacks can get tricky, boring, and error-prone.

As an example, refer to Kip Hampton's excellent article High-Performance XML Parsing With SAX.

In it, he uses an XML document that represents a series of emails to be sent as the sample data. He uses SAX to build up an argument list for Mail::Sendmail. When the end_element callback is fired for the record, he has Mail::Sendmail send the email.

The relevant point here is how much code he has to write just to extract the data from each individual record.
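To make the bookkeeping concrete, here is a minimal sketch of that kind of handler. It is not Hampton's code: it uses Python's standard xml.sax module rather than Perl's XML::SAX, and the element names (message, to, subject, body) and the on_record callback are assumptions made up for illustration. Note how three pieces of state have to be threaded through three callbacks just to collect one flat record.

```python
import xml.sax


class MailHandler(xml.sax.ContentHandler):
    """Collects the child elements of each <message> into a dict.

    The element names and the on_record callback are hypothetical;
    they only illustrate the state a SAX handler has to carry around.
    """

    def __init__(self, on_record):
        super().__init__()
        self.on_record = on_record  # called once per finished record
        self.record = None          # dict for the record being built
        self.field = None           # name of the field currently open
        self.buffer = []            # text chunks for the current field

    def startElement(self, name, attrs):
        if name == "message":
            self.record = {}
        elif self.record is not None:
            self.field = name
            self.buffer = []

    def characters(self, content):
        # Text may arrive in several chunks, so accumulate it.
        if self.field is not None:
            self.buffer.append(content)

    def endElement(self, name):
        if name == "message":
            self.on_record(self.record)
            self.record = None
        elif name == self.field:
            self.record[name] = "".join(self.buffer).strip()
            self.field = None


def parse_messages(xml_text):
    """Parse the document and return the list of collected records."""
    records = []
    xml.sax.parseString(xml_text.encode("utf-8"), MailHandler(records.append))
    return records


SAMPLE = """<mail>
  <message><to>a@example.com</to><subject>Hi</subject><body>Hello A</body></message>
  <message><to>b@example.com</to><subject>Yo</subject><body>Hello B</body></message>
</mail>"""
```

Calling parse_messages(SAMPLE) hands each finished record to the callback at the closing message tag, which is the point where the article's handler would send the email.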

According to the POD in your module:

Or you could view it as yet another event based XML parser that differs from all the others only in two things. First that it only let's you hook your callbacks to the closing tags. And that stores the data for you so that you do not have to use globals or closures and wonder where to attach the snippet of data you just received onto the structure you are building

what you have would help quite a bit in maintaining the state of an XML record.

If you don't mind, what would the rules for your module look like to perform the same task Hampton did with SAX?

Your module would be very useful implemented as a SAX handler, so users could take advantage of the many features of SAX (swappable parsers, chained filters/handlers, document writers, standardized interface). Imagine the process you describe above as a web service. You could use a SAX writer in the same pipeline to build up the response for the client request.
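As a rough illustration of what such a pipeline looks like, here is a sketch with a parser feeding a filter that feeds a writer. Again this is Python's standard xml.sax rather than Perl's XML::SAX, and the DropElement filter and the bcc element name are invented for the example; it says nothing about how XML::Rules itself would plug in.

```python
import io
import xml.sax
from xml.sax.saxutils import XMLFilterBase, XMLGenerator


class DropElement(XMLFilterBase):
    """Hypothetical filter: removes one named element and its content,
    passing every other event on to the next handler in the chain."""

    def __init__(self, parent, name):
        super().__init__(parent)
        self._name = name
        self._depth = 0  # greater than zero while inside a dropped element

    def startElement(self, name, attrs):
        if self._depth or name == self._name:
            self._depth += 1
        else:
            super().startElement(name, attrs)

    def endElement(self, name):
        if self._depth:
            self._depth -= 1
        else:
            super().endElement(name)

    def characters(self, content):
        if not self._depth:
            super().characters(content)


def run_pipeline(xml_bytes, drop):
    """parser -> DropElement filter -> XMLGenerator writer."""
    out = io.StringIO()
    pipeline = DropElement(xml.sax.make_parser(), drop)
    pipeline.setContentHandler(XMLGenerator(out, encoding="utf-8"))
    pipeline.parse(io.BytesIO(xml_bytes))
    return out.getvalue()
```

Because every stage speaks the same event interface, the writer at the end could be swapped for another filter or a record-collecting handler without touching the stages before it.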

Regardless of how you proceed, I'll definitely keep an eye on it. I know I'll use it.


In reply to Re: (RFC) XML::Rules - yet another XML parser by trwww
in thread (RFC) XML::Rules - yet another XML parser by Jenda
