PerlMonks  

Fastest XML Parser ?

by renodino (Curate)
on Feb 16, 2008 at 19:58 UTC (#668343)
renodino has asked for the wisdom of the Perl Monks concerning the following question:

Are there any benchmarks out there, or general experiences, on relative speeds of the various XML parser packages ? I'm building a server-side mashup and need to convert some XML to JSON. (I tried XML::XML2JSON, but it didn't function).

I'm currently using XML::Simple (just for GOWI's sake), but it's taking >1 sec to parse a fairly small (38k) chunk of not terribly complex XML...I gotta believe there's a faster solution.


Perl Contrarian & SQL fanboy

Re: Fastest XML Parser ?
by ikegami (Pope) on Feb 16, 2008 at 20:15 UTC
    If your JSON is a straight translation from XML (and maybe if it isn't), you probably don't need to build a tree, so you could save yourself some time by using a SAX interface.
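    A minimal sketch of the SAX idea above, assuming the XML::SAX distribution is installed (JSON::PP ships with Perl). The handler class name CompactHandler and the 'content' key are invented for illustration; the handler builds only the compact Perl structure that gets serialized, not a full DOM tree.

```perl
use strict;
use warnings;

package CompactHandler;    # hypothetical name, not part of any module
use base 'XML::SAX::Base';

# Start each parse with an empty root container and text buffer.
sub start_document {
    my ($self) = @_;
    $self->{stack} = [ {} ];
    $self->{text}  = [ '' ];
}

# Open a new node, seeded with the element's attributes.
sub start_element {
    my ($self, $el) = @_;
    my %node = map { $_->{Name} => $_->{Value} }
               values %{ $el->{Attributes} || {} };
    push @{ $self->{stack} }, \%node;
    push @{ $self->{text} }, '';
}

# Text may arrive in several chunks, so accumulate it.
sub characters {
    my ($self, $chars) = @_;
    $self->{text}[-1] .= $chars->{Data};
}

# Close the node and attach it to its parent under the element name.
sub end_element {
    my ($self, $el) = @_;
    my $node = pop @{ $self->{stack} };
    my $text = pop @{ $self->{text} };
    $text =~ s/^\s+|\s+$//g;

    my $value;
    if (!%$node) {
        $value = $text;                            # plain leaf becomes a string
    }
    else {
        $node->{content} = $text if length $text;  # 'content' is an arbitrary key choice
        $value = $node;
    }

    my $parent = $self->{stack}[-1];
    my $name   = $el->{Name};
    if (!exists $parent->{$name}) {
        $parent->{$name} = $value;
    }
    elsif (ref $parent->{$name} eq 'ARRAY') {
        push @{ $parent->{$name} }, $value;        # third and later siblings
    }
    else {
        $parent->{$name} = [ $parent->{$name}, $value ];  # second sibling: promote to list
    }
}

sub result { $_[0]{stack}[0] }

package main;
use XML::SAX::ParserFactory;
use JSON::PP;

my $xml = '<feed><title>News</title><item id="1">First</item><item id="2">Second</item></feed>';

my $handler = CompactHandler->new;
XML::SAX::ParserFactory->parser(Handler => $handler)->parse_string($xml);

my $data = $handler->result;
my $json = JSON::PP->new->canonical->encode($data);
print "$json\n";
```

    Which SAX driver does the parsing depends on what is installed; XML::SAX falls back to its own pure-Perl parser, which is slow, so this only pays off with a C-backed driver such as XML::LibXML::SAX.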
Re: Fastest XML Parser ?
by almut (Canon) on Feb 16, 2008 at 21:08 UTC

    XML::Bare seems to be pretty fast. I've used it for interfacing with an XML based API to an Oracle DB, and I must say I was very pleased with its performance (and ease of installation — no external library dependencies).  It's not perfect in every respect, but if your XML is simple and predictable, it might be worth giving it a try...

    (BTW, there are some benchmark results against other modules at the end of the module's POD.)

      Wow, very fast; thnx for the pointer.

      BUT....the output is very bloated. Every leaf node is a full hash, and it tries to preserve comments and attributes. Which is great if the output is going to be turned back into XML...but I just want the values/attributes keyed by their names, so I can create a reasonably compressed piece of JSON to send to the browser.

      I'm taking a stab at patching it w/ a "compact" option to achieve that and see how it behaves.
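      One way to picture such a "compact" pass is as a post-processing step over the parsed tree. This is a sketch only, not the actual patch: the function name compact is made up, and the sample tree is hand-written to mimic the shape described above (each leaf a hash with a 'value' key plus bookkeeping keys; the underscore-prefixed key names are assumptions).

```perl
use strict;
use warnings;
use JSON::PP;

# Recursively collapse a verbose parse tree into plain values.
# compact() is a hypothetical helper, not part of XML::Bare.
sub compact {
    my ($node) = @_;
    return [ map { compact($_) } @$node ] if ref $node eq 'ARRAY';
    return $node unless ref $node eq 'HASH';

    my %out;
    for my $key (keys %$node) {
        # Assumed convention: bookkeeping keys start with '_'; comments are dropped too.
        next if $key =~ /^_/ || $key eq 'comment';
        $out{$key} = compact($node->{$key});
    }

    # Whitespace-only text on a node that also has children carries no information.
    if (exists $out{value} && !ref $out{value} && keys(%out) > 1) {
        my $text = $out{value};
        delete $out{value} if !defined $text || $text !~ /\S/;
    }

    # A node holding nothing but text collapses to that text.
    return $out{value} if keys(%out) == 1 && exists $out{value};
    return \%out;
}

# Hand-built sample standing in for parser output (assumed shape).
my $tree = {
    item => {
        _pos  => 0,
        id    => { value => '42', _att => 1 },
        title => { value => 'Hello', _pos => 12 },
        tag   => [ { value => 'a', _pos => 30 }, { value => 'b', _pos => 40 } ],
        value => "\n  ",
    },
};

my $compact = compact($tree);
my $json    = JSON::PP->new->canonical->encode($compact);
print "$json\n";
```

      The obvious cost of this shortcut is that an element literally named "value" or "comment" would be mishandled, and the distinction between attributes and child elements is lost.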

      Update:
      Hacked it up to support both a compact mode, and a to_json() method;
      Results:
      XML::Simple + XML::LibXML + JSON::XS : ~0.6 sec
      XML::Bare w/ hacks: 0.04 sec (yes, 40 millisecs)

      I think I've found my favorite XML parser 8^))


      Perl Contrarian & SQL fanboy

        I'm interested in doing the same thing (convert XML to JSON quickly) and your solution looks very interesting.

        Would you mind posting it somewhere for the whole world to enjoy? :-)

        Thanks,
        GFK's

Re: Fastest XML Parser ?
by ajt (Prior) on Feb 16, 2008 at 21:55 UTC

    I would suggest that if you want an XML parser that is fast, you may wish to try the libxml2-based XML::LibXML. It's both fast and very complete; many other fast parsers are not actually complete and will fail on perfectly valid XML.

    See also Re: Xerces XML parser for other parsers and comments about them.


    --
    ajt
Re: Fastest XML Parser ?
by mirod (Canon) on Feb 17, 2008 at 06:26 UTC

    I maintain a series of benchmarks for various cases named Ways to Rome. The short answer is that the fastest parser is XML::LibXML.
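    For a quick local comparison on your own data, something along these lines may do. It is a rough sketch using the core Benchmark module, not the Ways to Rome suite; the generated sample document is made up, and any module that is not installed is simply skipped.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Made-up sample document; substitute the real payload for meaningful numbers.
my $xml = '<feed>'
        . join('', map { qq{<item id="$_"><title>Item $_</title></item>} } 1 .. 200)
        . '</feed>';

# One closure per parser; each is only run if its module loads.
my %candidates = (
    'XML::LibXML' => sub { XML::LibXML->load_xml(string => $xml) },
    'XML::Simple' => sub { XML::Simple::XMLin($xml) },
    'XML::Bare'   => sub { XML::Bare->new(text => $xml)->parse },
);

my %runnable;
for my $module (sort keys %candidates) {
    if (eval "require $module; 1") {
        $runnable{$module} = $candidates{$module};
    }
    else {
        print "skipping $module (not installed)\n";
    }
}

# A negative count asks Benchmark to run each candidate for at least that many CPU seconds.
cmpthese(-1, \%runnable) if %runnable;
```

    Keep in mind the three calls do different amounts of work (a DOM, a nested hash, and XML::Bare's own tree), so the numbers are only a rough guide.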

    In your case, if XML::Simple is that slow, maybe that's because you are using its default, pure Perl, parser which is really slow. Do you have XML::LibXML or XML::Parser installed? You might also need to set the XML_SIMPLE_PREFERRED_PARSER environment variable (or the $XML::Simple::PREFERRED_PARSER package variable) to XML::Parser or XML::LibXML. See the Environment section in the docs of the module.

      FWIW: $PREFERRED_PARSER should be XML::LibXML::SAX; just "XML::LibXML" falls down hard. And it did speed it up (0.4 secs vs. >1 sec). But XML::Simple's docs need a bit more clarity on the subject, cuz I tore out significant chunks of hair trying to find the secret sauce. (Not to mention the agony of setting up LibXML on Windows and fussing w/ environment variables so XML::LibXML could find it..).
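      In code, the setting described above looks roughly like this. A minimal sketch, assuming XML::Simple and XML::LibXML are both installed; the sample document is made up, and the timing is only there to compare against a run without the setting.

```perl
use strict;
use warnings;
use Time::HiRes qw(time);
use XML::Simple qw(XMLin);

# Select the libxml2-backed SAX driver, per the post above.
$XML::Simple::PREFERRED_PARSER = 'XML::LibXML::SAX';
# The environment variable form, set before running the script:
#   XML_SIMPLE_PREFERRED_PARSER=XML::LibXML::SAX

# Made-up sample document for illustration.
my $xml = '<config><name>demo</name><port>8080</port></config>';

my $start = time();
my $data  = XMLin($xml);
printf "parsed in %.4f sec, name=%s\n", time() - $start, $data->{name};
```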

      Perl Contrarian & SQL fanboy
