PerlMonks  

Fastest XML Parser ?

by renodino (Curate)
on Feb 16, 2008 at 19:58 UTC ( [id://668343]=perlquestion )

renodino has asked for the wisdom of the Perl Monks concerning the following question:

Are there any benchmarks out there, or general experiences, on relative speeds of the various XML parser packages ? I'm building a server-side mashup and need to convert some XML to JSON. (I tried XML::XML2JSON, but it didn't function).

I'm currently using XML::Simple (just for GOWI's sake), but it's taking >1 sec to parse a fairly small (38k) chunk of not terribly complex XML...I gotta believe there's a faster solution.


Perl Contrarian & SQL fanboy

Replies are listed 'Best First'.
Re: Fastest XML Parser ?
by almut (Canon) on Feb 16, 2008 at 21:08 UTC

    XML::Bare seems to be pretty fast. I've used it for interfacing with an XML based API to an Oracle DB, and I must say I was very pleased with its performance (and ease of installation — no external library dependencies).  It's not perfect in every respect, but if your XML is simple and predictable, it might be worth giving it a try...

    (BTW, there's some benchmark results against other modules at the end of the module's POD.)

      Wow, very fast; thnx for the pointer.

      BUT....the output is very bloated. Every leaf node is a full hash, and it tries to preserve comments and attributes. Which is great if the output is going to be turned back into XML...but I just want the values/attributes keyed by their names, so I can create a reasonably compressed piece of JSON to send to the browser.

      I'm taking a stab at patching it w/ a "compact" option to achieve that and see how it behaves.

      Update:
      Hacked it up to support both a compact mode, and a to_json() method;
      Results:
      XML::Simple + XML::LibXML + JSON::XS : ~0.6 sec
      XML::Bare w/ hacks: 0.04 sec (yes, 40 millisecs)

      I think I've found my favorite XML parser 8^))


      Perl Contrarian & SQL fanboy
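For anyone wondering what a "compact" pass over that kind of output might look like, here is a rough sketch. It is not renodino's patch (which isn't shown in the thread). It assumes a tree where every leaf is a hash with a `value` key and any bookkeeping keys start with an underscore, and it runs on a hand-built sample rather than real parser output. The `compact` helper is a made-up name, and JSON::PP (core since Perl 5.14) stands in for the JSON::XS mentioned above.

```perl
use strict;
use warnings;
use JSON::PP;

# Hypothetical helper: collapse a verbose parse tree into plain values.
# Assumed input shape (an assumption, not taken from the thread): every
# leaf is a hash with a 'value' key, and bookkeeping keys start with '_'.
sub compact {
    my ($node) = @_;

    # Repeated elements arrive as array refs; compact each member.
    return [ map { compact($_) } @$node ] if ref $node eq 'ARRAY';

    # Plain scalars pass through untouched.
    return $node unless ref $node eq 'HASH';

    # Ignore bookkeeping keys, keep everything else.
    my @keys = grep { !/^_/ } keys %$node;

    # A hash holding nothing but 'value' is a leaf: return the bare string.
    return $node->{value} if @keys == 1 && $keys[0] eq 'value';

    return { map { $_ => compact( $node->{$_} ) } @keys };
}

# Hand-built sample in the assumed shape.
my $verbose = {
    item => {
        _pos => 0,
        id   => { value => '42', _att => 1 },
        name => { value => 'widget' },
        tag  => [ { value => 'a' }, { value => 'b' } ],
    },
};

my $compact = compact($verbose);

# canonical() sorts keys so the output is stable between runs.
my $json = JSON::PP->new->canonical->encode($compact);
print "$json\n";
```

Attributes and child elements end up side by side under their own names, so a name clash between the two would need extra handling that this sketch leaves out.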

        I'm interested in doing the same thing (convert XML to JSON quickly) and your solution looks very interesting.

        Would you mind posting it somewhere for the whole world to enjoy? :-)

        Thanks,
        GFK's

Re: Fastest XML Parser ?
by ajt (Prior) on Feb 16, 2008 at 21:55 UTC

    I would suggest that if you want an XML parser that is fast, you may wish to try the libxml2-based XML::LibXML. It's both fast and very complete; many other fast parsers are not actually complete and will fail on perfectly valid XML.

    See also Re: Xerces XML parser for other parsers and comments about them.


    --
    ajt
Re: Fastest XML Parser ?
by mirod (Canon) on Feb 17, 2008 at 06:26 UTC

    I maintain a series of benchmarks for various cases named Ways to Rome. The short answer is that the fastest parser is XML::LibXML.

    In your case, if XML::Simple is that slow, maybe that's because you are using its default, pure Perl, parser, which is really slow. Do you have XML::LibXML or XML::Parser installed? You might also need to set the XML_SIMPLE_PREFERRED_PARSER environment variable (or the $XML::Simple::PREFERRED_PARSER package variable) to XML::Parser or XML::LibXML. See the Environment section in the docs of the module.

      FWIW: $PREFERRED_PARSER should be XML::LibXML::SAX; just "XML::LibXML" falls down hard. And it did speed it up (0.4 secs vs. >1 sec). But XML::Simple's docs need a bit more clarity on the subject, cuz I tore out significant chunks of hair trying to find the secret sauce. (Not to mention the agony of setting up LibXML on Windows and fussing w/ environment variables so XML::LibXML could find it..).

      Perl Contrarian & SQL fanboy
        I agree that the docs could be clearer. Setting this variable causes code to die badly if the fast module is not installed.

        Here is code that I use to check if the fast module is installed. If it is installed, the code is happy, and it just uses the fast module. If it is not installed, it quietly falls back to the slow module:

        use English qw($CHILD_ERROR);
        use XML::Simple;

        # If a fast parser for XML::Simple is installed, then use it.
        my $parser = 'XML::LibXML::SAX';
        my $cmd    = "perl -M$parser -e 1";
        my $output = qx($cmd 2>&1);
        unless ($CHILD_ERROR) {
            $XML::Simple::PREFERRED_PARSER = $parser;
        }
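A possible variation on the same idea, sketched here as an assumption rather than anything posted in the thread, is to test for the module inside the running process with `eval { require ... }` instead of spawning a second perl. The `first_available` helper is a made-up name. XML::Simple is loaded before the variable is set, on the assumption that loading it afterwards could reset the value.

```perl
use strict;
use warnings;

# Hypothetical helper: return the first module in the list that loads,
# or undef if none of them can be loaded.
sub first_available {
    my @candidates = @_;
    for my $module (@candidates) {
        # Turn Foo::Bar into Foo/Bar.pm so require() accepts a runtime name.
        ( my $file = "$module.pm" ) =~ s{::}{/}g;
        return $module if eval { require $file; 1 };
    }
    return;
}

my $parser = first_available('XML::LibXML::SAX');

# Only touch the setting when both the fast parser and XML::Simple load.
if ( defined $parser && eval { require XML::Simple; 1 } ) {
    no warnings 'once';
    $XML::Simple::PREFERRED_PARSER = $parser;
}
```

The trade-off: this avoids the cost of starting another interpreter, but a module that crashes hard while loading would take the current process down with it, which the child-process check would survive.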
Re: Fastest XML Parser ?
by ikegami (Patriarch) on Feb 16, 2008 at 20:15 UTC
    If your JSON is a straight translation from XML (and maybe if it isn't), you probably don't need to build a tree, so you could save yourself some time by using a SAX interface.

Node Type: perlquestion [id://668343]
Approved by ikegami