Hi Monks,
I have been working with XML::Simple for quite some time now and am very happy with it.
However, as things have moved on, both in Perl and in my application, I need to change to some other Perl XML module.
The XML documents that XML::Simple has to process are steadily growing in size (up to 10 MB) and in complexity, which makes the XML-to-hash conversion very slow.
Basically, my code connects to multiple vendors via web services with SOAP::Lite (another great Perl module) and converts the XML to JSON for the browser to display. (Maybe I could have used XML::XML2JSON for this, but all the vendors use different XML formats.)
XML -> PERL -> JSON
As the results have to be displayed in the browser, performance is a big issue.
I have been going through CPAN and, of course, PerlMonks, reading about possible upgrade paths.
There are very good discussions on PerlMonks, and a great tutorial too: "Stepping up from XML::Simple to XML::LibXML".
I also gathered from PerlMonks that XML::LibXML is the best way forward.
As I am looking for a switch that avoids a complete rewrite of my code, I need an XML-to-Perl-hash conversion.
My reason for considering XML::Compile is that it can give me a Perl hash, and I read in the XML::Compile documentation that it is based on XML::LibXML and complies with the XML standards.
Another possibility would be to create a template Perl hash in my own format and convert the XML directly into it.
I haven't figured out yet whether that is possible; it would be great if somebody has an idea about it.
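To make the template idea concrete, here is a rough sketch of what I have in mind. It is not something I have benchmarked. The sample XML, the element names, the %template mapping and the xml_to_records helper are all made up for illustration; only the XML::LibXML calls (load_xml, findnodes, findvalue) and the core JSON::PP module are real.

```perl
use strict;
use warnings;
use XML::LibXML;
use JSON::PP;

# Hypothetical vendor response; the element names are invented for illustration.
my $xml = <<'XML';
<response>
  <item id="1"><name>Widget</name><price>9.50</price></item>
  <item id="2"><name>Gadget</name><price>12.00</price></item>
</response>
XML

# Template: output hash key => XPath relative to each row node.
# The assumption is that there would be one such template per vendor format.
my %template = (
    id    => '@id',
    name  => 'name',
    price => 'price',
);

# Hypothetical helper: parse once with XML::LibXML, then pull out only the
# values named in the template instead of converting the whole tree.
sub xml_to_records {
    my ($xml_string, $row_xpath, $template) = @_;
    my $doc = XML::LibXML->load_xml(string => $xml_string);
    my @records;
    for my $node ($doc->findnodes($row_xpath)) {
        my %record = map { $_ => $node->findvalue($template->{$_}) }
                     keys %$template;
        push @records, \%record;
    }
    return \@records;
}

my $records = xml_to_records($xml, '/response/item', \%template);

# canonical() sorts the keys so the JSON output is stable between runs.
print JSON::PP->new->canonical->encode($records), "\n";
```

The attraction would be that the rest of my code keeps working on plain hashes, while each vendor only needs its own template and row XPath. Whether this is actually faster than a full XML-to-hash conversion on a 10 MB document is something I would still have to measure.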
As the conversion has to be fast, I tried to benchmark different Perl modules that convert XML to a hash structure (XML::LibXML is the exception in the benchmark; I included it just to see how fast it parses).
I found great info at "Benchmarks of XML Parsers".
From my test I found that XML::Compile is slower than XML::Simple (51.5/s versus 149/s).
Please see the code below:
use strict;
use warnings;
use XML::LibXML;
use XML::Fast;
use XML::Simple;
use XML::Bare;
use XML::Compile::Schema;
use Data::Dumper;
use Benchmark qw/cmpthese/;

# Found this great tip from perlmonks
$XML::Simple::PREFERRED_PARSER = 'XML::Parser';

my $doc = 'XML String ';

# I have done this outside to compile the schema once, but not sure if it works like that
my $schema = XML::Compile::Schema->new('./myschema.xsd');
my $reader = $schema->compile(READER => '{myns}mytype');

# A negative count means: run each sub for at least 10 CPU seconds
cmpthese(-10, {
    libxml     => sub { XML::LibXML->new->parse_string($doc) },
    xmlfast    => sub { XML::Fast::xml2hash($doc) },
    xmlbare    => sub { XML::Bare->new(text => $doc)->parse },
    xmlsimple  => sub { XML::Simple->new(ForceArray => 0, KeyAttr => {})->XMLin($doc) },
    xmlcompile => sub { my $hash = $reader->($doc) },
});
Results on my machine
             Rate xmlcompile xmlsimple xmlbare xmlfast libxml
xmlcompile 51.5/s         --      -66%    -97%    -97%   -97%
xmlsimple   149/s       190%        --    -91%    -92%   -92%
xmlbare    1651/s      3107%     1006%      --    -11%   -11%
xmlfast    1846/s      3487%     1137%     12%      --    -0%
libxml     1846/s      3487%     1137%     12%      0%     --
I want to know whether I am doing things correctly, and whether XML::Compile really is this slow.
Please advise.
Update: I found one more possible alternative.
- The recommended XML::LibXML can be used for parsing efficiency and standards compliance.
- The individual nodes can then be converted to hashes with XML::Hash::LX for ease of use.
- It cannot be used as a drop-in replacement for XML::Simple; a significant code rewrite is required.
The module documentation says:
use XML::Hash::LX;
# Usage with XML::LibXML
my $doc = XML::LibXML->new->parse_string($xml);
my $xp = XML::LibXML::XPathContext->new($doc);
$xp->registerNs('rss', 'http://purl.org/rss/1.0/');
# then process xpath
for ($xp->findnodes('//rss:item')) {
    # and convert to hash concrete nodes
    my $item = xml2hash($_);
    print Dumper+$item
}
This module is a companion for XML::LibXML.
It operates with LibXML objects, could return or accept LibXML objects, and may be used for easy data transformations
It is faster in parsing than XML::Simple, XML::Hash, XML::Twig and of course much slower than XML::Bare ;)
It is faster in composing than XML::Hash, but slower than XML::Simple
Parse benchmark:
           Rate Simple   Hash   Twig Hash::LX   Bare
Simple   11.3/s     --    -2%   -16%     -44%   -97%
Hash     11.6/s     2%     --   -14%     -43%   -97%
Twig     13.5/s    19%    16%     --     -34%   -96%
Hash::LX 20.3/s    79%    75%    51%       --   -95%
Bare      370/s  3162%  3088%  2650%    1721%     --