Beefy Boxes and Bandwidth Generously Provided by pair Networks
Just another Perl shrine

comment on

( #3333=superdoc: print w/replies, xml ) Need Help??

Hi Jenda

I've done stuff like this very often.

When I see this, I see a generic state maintainence mechanism for SAX.

SAX is great. It is sometimes the only option when you have documents that are too big to fit in to RAM. But because it provides nothing more than a dispatch mechanism for document particles, maintaining state between the different callbacks can get tricky, boring, and error prone.

As an example, refer to Kip Hampton's excellent article High-Performance XML Parsing With SAX.

In it, he uses an XML document that represents a series of emails that need sent as the sample data. He uses SAX to build up an argument list for Mail::Sendmail. When the end_element callback is fired for the record, he has Mail::Sendmail send the email.

The relevant part here is note how much code he has to write to extract the data from the individual record.

According to the POD in your module:

Or you could view it as yet another event based XML parser that differs from all the others only in two things. First that it only let's you hook your callbacks to the closing tags. And that stores the data for you so that you do not have to use globals or closures and wonder where to attach the snippet of data you just received onto the structure you are building

what you have would help quite a bit in maintaining the state of an XML record.

If you dont mind, what would the rules for your module look like to perform the same task Hampton did with SAX?

Your module would be very useful implemented as a SAX handler so users could take advantage of the many features of SAX (swappable parsers, chained filters/handlers, document writers, standardized interface). Imagine the process you describe aboive as a web serice. You could use a SAX writer in the same pipeline to build up the response for the client request.

Regardless of how you proceed, I'll definitely keep an eye on it. I know I'll use it.


In reply to Re: (RFC) XML::Rules - yet another XML parser by trwww
in thread (RFC) XML::Rules - yet another XML parser by Jenda

Use:  <p> text here (a paragraph) </p>
and:  <code> code here </code>
to format your post; it's "PerlMonks-approved HTML":

  • Posts are HTML formatted. Put <p> </p> tags around your paragraphs. Put <code> </code> tags around your code and data!
  • Titles consisting of a single word are discouraged, and in most cases are disallowed outright.
  • Read Where should I post X? if you're not absolutely sure you're posting in the right place.
  • Please read these before you post! —
  • Posts may use any of the Perl Monks Approved HTML tags:
    a, abbr, b, big, blockquote, br, caption, center, col, colgroup, dd, del, div, dl, dt, em, font, h1, h2, h3, h4, h5, h6, hr, i, ins, li, ol, p, pre, readmore, small, span, spoiler, strike, strong, sub, sup, table, tbody, td, tfoot, th, thead, tr, tt, u, ul, wbr
  • You may need to use entities for some characters, as follows. (Exception: Within code tags, you can put the characters literally.)
            For:     Use:
    & &amp;
    < &lt;
    > &gt;
    [ &#91;
    ] &#93;
  • Link using PerlMonks shortcuts! What shortcuts can I use for linking?
  • See Writeup Formatting Tips and other pages linked from there for more info.
  • Log In?

    What's my password?
    Create A New User
    and the web crawler heard nothing...

    How do I use this? | Other CB clients
    Other Users?
    Others studying the Monastery: (3)
    As of 2019-01-16 00:02 GMT
    Find Nodes?
      Voting Booth?
      After Perl5, I'm mostly interested in:

      Results (270 votes). Check out past polls.