PerlMonks  

Re^4: UTF-8 and XML::Parser

by Anonymous Monk
on Oct 14, 2012 at 05:49 UTC (#998922=note)


in reply to Re^3: UTF-8 and XML::Parser
in thread UTF-8 and XML::Parser

i benchmarked the binmode variant against the 'use open' variant (script at the end of this post). the test data is an xml file with 100 lines, each holding 32000 characters (utf8) as (P)CDATA. the script below parsed it in 0.20 seconds, while the 'use utf8; / binmode' method took about 17.5 seconds.
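a minimal sketch of how a comparable test file could be generated, in case someone wants to repeat the measurement. the element names (doc, l), the helper name write_test_xml and the test character (a-umlaut, U+00E4) are assumptions for illustration only; the post does not say what the original file looked like beyond the line and character counts.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Assumptions (not from the original post): element names, helper name,
# and the non-ASCII test character are made up for illustration.
my $file  = 'x.xml';
my $char  = "\x{E4}";    # a-umlaut, two bytes once encoded as UTF-8
my $lines = 100;
my $count = 32000;

# Write $nlines elements, each holding $nchars copies of $ch,
# encoded as UTF-8. Returns the resulting file size in bytes.
sub write_test_xml {
    my ($path, $ch, $nlines, $nchars) = @_;
    open(my $out, '>:encoding(UTF-8)', $path)
        or die "cannot write $path: $!";
    print {$out} qq{<?xml version="1.0" encoding="UTF-8"?>\n<doc>\n};
    print {$out} '<l>', $ch x $nchars, "</l>\n" for 1 .. $nlines;
    print {$out} "</doc>\n";
    close($out) or die "cannot close $path: $!";
    return -s $path;
}

my $size = write_test_xml($file, $char, $lines, $count);
print "wrote $file ($size bytes)\n";
```

with the numbers above this gives a file of a bit over 6 MB, since each test character takes two bytes.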

unfortunately perl crashes when i give a filehandle to the parser while using the 'use open qw/:std :utf8/;' method once the file gets big. with the 'use utf8; / binmode' method, passing the filehandle to the parser takes about 35 seconds.

in all runs the output was redirected to /dev/null.

#!/usr/bin/perl
use XML::Parser;
#use utf8;
use open qw/:std :utf8/;

$ch = sub {
    my ($p, $w) = @_;
    # binmode STDOUT, ":encoding(UTF-8)";
    print "$w\n";
};

$p = XML::Parser->new(ProtocolEncoding => 'UTF-8');
$p->setHandlers('Char' => $ch);

my $xml = "";
open(F, '< x.xml');
while(<F>) {
    $xml .= $_;
}
$p->parse($xml);
#$p->parse(*F);
close(F);
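the read loop and the timing could also be written with core modules only. this is a sketch, not the code that produced the numbers above: the helper names slurp and timed are made up, and it only times the file read, not the parse. it uses a lexical filehandle, checks the result of open, and takes the I/O layer as a parameter so the decoded and raw reads can be compared.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Time::HiRes ();

# Hypothetical helper: read a whole file in one go through the given
# PerlIO layer, e.g. ':encoding(UTF-8)' or ':raw'.
sub slurp {
    my ($path, $layer) = @_;
    open(my $fh, "<$layer", $path) or die "cannot open $path: $!";
    local $/;              # no record separator: read everything at once
    my $data = <$fh>;
    close($fh);
    return $data;
}

# Hypothetical helper: run a code ref once, return elapsed wall-clock seconds.
sub timed {
    my ($code) = @_;
    my $t0 = Time::HiRes::time();
    $code->();
    return Time::HiRes::time() - $t0;
}

# Only runs when a file name is given on the command line.
if (@ARGV) {
    my $file = shift @ARGV;
    for my $layer (':encoding(UTF-8)', ':raw') {
        my $xml;
        my $secs = timed(sub { $xml = slurp($file, $layer) });
        printf "%-18s length %d, read in %.3f s\n", $layer, length($xml), $secs;
    }
}
```

with ':encoding(UTF-8)' the length is counted in characters, with ':raw' in bytes, so the two numbers differ as soon as the file contains non-ASCII text.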


Re^5: UTF-8 and XML::Parser
by remiah (Hermit) on Oct 14, 2012 at 06:33 UTC

    parsefile() has some trouble?
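    for reference, a minimal sketch of the parsefile() route, where XML::Parser opens the file itself and the script sets no I/O layers of its own. the helper name count_chars_in_file is made up for illustration, and whether this avoids the crash described in the parent node is untested here.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: parse a file with XML::Parser's parsefile() and
# return the number of characters seen by the Char handler.
sub count_chars_in_file {
    my ($path) = @_;
    require XML::Parser;    # loaded on demand; must be installed
    my $n = 0;
    my $p = XML::Parser->new(ProtocolEncoding => 'UTF-8');
    $p->setHandlers(
        Char => sub {
            my ($expat, $text) = @_;
            # The handler may be called several times for one text node,
            # so the lengths are summed up.
            $n += length $text;
        },
    );
    $p->parsefile($path);
    return $n;
}

# Only runs when a file name is given on the command line.
if (@ARGV) {
    my $file = shift @ARGV;
    print count_chars_in_file($file), " characters of text in $file\n";
}
```

    counting instead of printing keeps STDOUT out of the measurement, so the binmode question does not come up at all.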
