in reply to
XML::Twig and threads [solved]
Hello, grizzley.
I have little experience with huge XML files, so I took a ready-made 100MB XML sample file as an example.
Does your colleague have free memory while his process is running? XML::Twig will eat up memory on large XML files unless you call "purge" or "flush".
Below is my test script, which counts text tags in two ways.
use strict;
use warnings;
use XML::Twig;
use Time::HiRes;

# 1. Count every <text> element, purging after each one to free memory.
my $cnt1 = 0;
my $b1   = Time::HiRes::time();
XML::Twig->new(
    twig_roots => {
        'text' => sub { $cnt1++; $_[0]->purge; },
    },
)->parsefile("standard");
my $e1 = Time::HiRes::time();

# 2. Count only the <text> elements below /site/regions/africa.
my $cnt2 = 0;
my $b2   = Time::HiRes::time();
XML::Twig->new(
    twig_roots => {
        '/site/regions/africa//text' => sub { $cnt2++; },
    },
)->parsefile("standard");
my $e2 = Time::HiRes::time();

print "1. text count=$cnt1, time=" . ($e1 - $b1) . "\n";
print "2. text count=$cnt2, time=" . ($e2 - $b2) . "\n";
__DATA__
1. text count=105114, time=111.188741922379
2. text count=1657, time=60.9104990959167
When I forgot to call purge(), the first example ate up my memory and dumped core. Sometimes purge() needs some care, because it purges the innermost element (
XML Newbie's example of Twig is related to this).
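Here is a small sketch of the kind of care I mean. It assumes my understanding of purge() is right: called in a handler, it frees everything parsed so far, including the element being handled. The tiny document and the helper names_seen_by_item() are made up for illustration. If the inner handler purges, the outer handler can no longer find the child it wants.

```perl
use strict;
use warnings;
use XML::Twig;

# Made-up miniature document, only for illustration.
my $xml = '<list><item><name>a</name></item><item><name>b</name></item></list>';

# Hypothetical helper: returns how many <item> handlers can still see
# their <name> child when they run.
sub names_seen_by_item {
    my ($purge_in_name) = @_;
    my $found = 0;
    XML::Twig->new(
        twig_handlers => {
            # Inner handler: optionally purge as soon as <name> closes.
            name => sub {
                my ($twig) = @_;
                $twig->purge if $purge_in_name;
            },
            # Outer handler: look for the <name> child, then purge.
            item => sub {
                my ($twig, $elt) = @_;
                $found++ if $elt->first_child('name');
                $twig->purge;
            },
        },
    )->parse($xml);
    return $found;
}

my $kept   = names_seen_by_item(0);   # purge only in the outer handler
my $purged = names_seen_by_item(1);   # purge in the inner handler too
print "items with name: kept=$kept, purged=$purged\n";
```

So it seems safest to purge only in the handler of the outermost element you still need, after you have taken what you want from its children.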
And if you can narrow the range with an XPath-like expression, parsing could become faster.
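To try the narrowing idea without a 100MB file, here is a minimal sketch on an inline document. The element names copy the layout used in my script above, but the document itself is made up, and purge() is called in both handlers on the assumption that nothing else is needed from the tree.

```perl
use strict;
use warnings;
use XML::Twig;

# Made-up stand-in for the big sample file.
my $xml = <<'XML';
<site>
  <regions>
    <africa><item><description><text>a</text></description></item></africa>
    <asia>
      <item><description><text>b</text></description></item>
      <item><description><text>c</text></description></item>
    </asia>
  </regions>
</site>
XML

my ($all, $africa) = (0, 0);

# Broad root: every <text> element in the document is built and counted.
XML::Twig->new(
    twig_roots => {
        'text' => sub { my ($twig) = @_; $all++; $twig->purge; },
    },
)->parse($xml);

# Narrow root: only <text> elements somewhere below /site/regions/africa.
XML::Twig->new(
    twig_roots => {
        '/site/regions/africa//text' => sub { my ($twig) = @_; $africa++; $twig->purge; },
    },
)->parse($xml);

print "all=$all africa=$africa\n";
```

With the narrow root, XML::Twig builds far fewer elements, which is probably where the time difference in my measurement comes from.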
I agree with the other monks' opinions ...
regards.