Hi Monks,

I have been working with XML::Simple for quite some time now and am very happy with it. However, as things have moved on, both in Perl and in my application, I need to change to some other Perl XML module. The XML documents that XML::Simple has to process are consistently getting bigger (up to 10MB) and more complex, which makes the XML-to-hash conversion very slow.

Basically, what my code does is connect to multiple vendors via web services with SOAP::Lite (another great Perl module) and convert the XML to JSON for the browser to display. (Maybe I could have used XML::XML2JSON for this, but all vendors have different XML formats.)
XML -> Perl -> JSON
As the results have to be displayed in the browser, performance is a big issue.
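
For illustration, the XML -> Perl -> JSON step could look roughly like the sketch below when only selected fields are needed. The vendor payload, the element names and the `offers` key are all made up for the example; JSON::PP is used only because it ships with Perl, and any JSON encoder would do.

```perl
use strict;
use warnings;
use XML::LibXML;
use JSON::PP;

# Hypothetical vendor response; real payloads differ per vendor.
my $xml = <<'XML';
<offers>
  <offer id="1"><name>Basic</name><price>9.50</price></offer>
  <offer id="2"><name>Plus</name><price>19.00</price></offer>
</offers>
XML

my $dom = XML::LibXML->load_xml(string => $xml);

# Build plain Perl hashes holding only the fields the browser needs.
my @offers = map {
    +{
        id    => $_->getAttribute('id'),
        name  => $_->findvalue('name'),
        price => $_->findvalue('price'),
    }
} $dom->findnodes('/offers/offer');

# canonical() sorts hash keys so the output is stable between runs.
my $json = JSON::PP->new->canonical->encode({ offers => \@offers });
print "$json\n";
```

This skips the generic "whole document to hash" step and builds the structure that gets serialised directly, which is one way to avoid converting parts of a 10MB document that are never displayed.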

I have been going through CPAN, and of course PerlMonks, reading about a possible upgrade.
There are very good discussions on PerlMonks, and a great tutorial too, Stepping up from XML::Simple to XML::LibXML, on moving from XML::Simple to XML::LibXML. I also gathered from PerlMonks that XML::LibXML is the best way forward.

As I am looking for a switch that avoids a complete rewrite of my code, I need something that gives me an XML-to-Perl-hash structure.
My reason for considering XML::Compile is that it can give me a Perl hash, and I also read in the XML::Compile documentation that it is based on XML::LibXML and complies with the XML standards.
Another possibility could be to create a template Perl hash in my own format and convert the XML directly into it.
I haven't figured out yet whether that is possible; it would be great if somebody has an idea about it.
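
One way to read the "template hash" idea is a hash that maps output keys to XPath expressions, evaluated against each record node with XML::LibXML. The sketch below assumes that reading; `records_from_xml`, the template keys and the sample XML are hypothetical names invented for the example, not part of any module.

```perl
use strict;
use warnings;
use XML::LibXML;

# Hypothetical template: output key => XPath relative to each record node.
my %template = (
    id    => '@id',
    title => 'name',
    cost  => 'pricing/amount',
);

# Parse the XML once, then fill one hash per node matched by $record_xpath.
sub records_from_xml {
    my ($xml, $record_xpath, $template) = @_;
    my $dom = XML::LibXML->load_xml(string => $xml);
    my @records;
    for my $node ($dom->findnodes($record_xpath)) {
        # findvalue returns an empty string when the path matches nothing.
        my %record = map { ($_ => $node->findvalue($template->{$_})) }
                     keys %$template;
        push @records, \%record;
    }
    return \@records;
}

my $xml = '<list><item id="7"><name>Widget</name>'
        . '<pricing><amount>3.20</amount></pricing></item></list>';
my $records = records_from_xml($xml, '/list/item', \%template);
print "$records->[0]{title} costs $records->[0]{cost}\n";
```

With one template per vendor, the rest of the application could keep working with the same hash layout regardless of which vendor format came in.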

As the conversion has to be fast, I tried to benchmark different Perl modules that convert XML to a hash structure (XML::LibXML is the exception in the benchmark; I included it just to see how fast it parses).
I found great info at Benchmarks of XML Parsers.
From my test I found that XML::Compile is slower than XML::Simple.
Please see the code below.

use XML::LibXML;
use XML::Fast;
use XML::Simple;
use XML::Bare;
use XML::Compile::Schema;
use Data::Dumper;
$XML::Simple::PREFERRED_PARSER = 'XML::Parser'; # Found this great tip from perlmonks
use Benchmark qw/cmpthese timethese/; # timethese must be imported too, since it is called below

my $doc = 'XML String ';

my $schema = XML::Compile::Schema->new('./myschema.xsd');
my $reader = $schema->compile(READER => '{myns}mytype');
# I have done this outside to compile schema once but not sure if it works like that

cmpthese timethese -10, {
    libxml     => sub { XML::LibXML->new->parse_string($doc) },
    xmlfast    => sub { XML::Fast::xml2hash($doc) },
    xmlbare    => sub { XML::Bare->new(text => $doc)->parse },
    xmlsimple  => sub { XML::Simple->new(ForceArray => 0, KeyAttr => {})->XMLin($doc); },
    xmlcompile => sub { my $hash = $reader->("$doc"); },
};

Results on my machine

              Rate xmlcompile xmlsimple xmlbare xmlfast libxml
xmlcompile  51.5/s         --      -66%    -97%    -97%   -97%
xmlsimple    149/s       190%        --    -91%    -92%   -92%
xmlbare     1651/s      3107%     1006%      --    -11%   -11%
xmlfast     1846/s      3487%     1137%     12%      --    -0%
libxml      1846/s      3487%     1137%     12%      0%     --

I want to know whether I am doing things correctly, and whether XML::Compile is really this slow.
Please advise.
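
One thing that can be checked in a benchmark like this is how much of each timing is object construction rather than parsing: the XML::LibXML and XML::Simple subs build a new object on every iteration, while the XML::Compile reader is already created once outside the timed code. The sketch below is an assumption about how the comparison could be set up with construction hoisted out; the sample document is a stand-in, and it says nothing about how the numbers would come out on real 10MB inputs.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);
use XML::LibXML;
use XML::Simple;

# Stand-in document; a real test should use a representative vendor payload.
my $doc = '<root><item id="1">a</item><item id="2">b</item></root>';

# Build the parser objects once, outside the timed subs, so that only
# the parsing work itself is measured.
my $libxml = XML::LibXML->new;
my $simple = XML::Simple->new(ForceArray => 0, KeyAttr => {});

# A negative count means "run each sub for at least that many CPU seconds".
cmpthese(-1, {
    libxml_reused    => sub { $libxml->parse_string($doc) },
    xmlsimple_reused => sub { $simple->XMLin($doc) },
});
```

Note also that XML::LibXML only builds a DOM here, while the other candidates produce a Perl hash, so its rate is not directly comparable to theirs.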

I found one more probable alternative.

  • The recommended XML::LibXML can be used for parsing efficiency and standards compliance.
  • Individual nodes can be converted to hashes with XML::Hash::LX for ease of use.
  • It cannot be used as a drop-in replacement for XML::Simple; a significant code rewrite is required.

The module documentation says:

use XML::Hash::LX;

# Usage with XML::LibXML
my $doc = XML::LibXML->new->parse_string($xml);
my $xp = XML::LibXML::XPathContext->new($doc);
$xp->registerNs('rss', 'http://purl.org/rss/1.0/');

# then process xpath
for ($xp->findnodes('//rss:item')) {
    # and convert to hash concrete nodes
    my $item = xml2hash($_);
    print Dumper+$item
}

This module is a companion for XML::LibXML.
It operates with LibXML objects, could return or accept LibXML objects, and may be used for easy data transformations

It is faster in parsing than XML::Simple, XML::Hash, XML::Twig and of course much slower than XML::Bare ;)

It is faster in composing than XML::Hash, but slower than XML::Simple

Parse benchmark:

            Rate Simple  Hash  Twig Hash::LX  Bare
Simple    11.3/s     --   -2%  -16%     -44%  -97%
Hash      11.6/s     2%    --  -14%     -43%  -97%
Twig      13.5/s    19%   16%    --     -34%  -96%
Hash::LX  20.3/s    79%   75%   51%       --  -95%
Bare       370/s  3162% 3088% 2650%    1721%    --


In reply to Is it wiser to move on from XML::Simple to XML::Compile by mohan2monks
