
Re^4: UTF-8 and XML::Parser

by Anonymous Monk
on Oct 14, 2012 at 05:49 UTC (#998922=note)


in reply to Re^3: UTF-8 and XML::Parser
in thread UTF-8 and XML::Parser

I benchmarked the binmode variant against the 'use open qw/:std :utf8/' variant shown below. I made an XML file with 100 lines and 32000 UTF-8 characters in each line, as (P)CDATA. The script below processed it in 0.20 seconds, while the 'use utf8; / binmode' method takes about 17.5 seconds.

Unfortunately, perl crashes when I give a filehandle to the parser while using the 'use open qw/:std :utf8/;' method once the file gets big. The 'use utf8; / binmode' method takes about 35 seconds when I pass the filehandle to the parser.

Output was redirected to /dev/null.

#!/usr/bin/perl
use XML::Parser;
#use utf8;
use open qw/:std :utf8/;

$ch = sub {
    my ($p, $w) = @_;
    # binmode STDOUT, ":encoding(UTF-8)";
    print "$w\n";
};

$p = XML::Parser->new(ProtocolEncoding => 'UTF-8');
$p->setHandlers('Char' => $ch);

my $xml = "";
open(F, '< x.xml');
while(<F>) { $xml .= $_; }
$p->parse($xml);
#$p->parse(*F);
close(F);
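
For anyone who wants to reproduce the measurement, a test file of the shape described above (100 lines, 32000 non-ASCII characters per line) could be generated roughly like this. This is only a sketch: the character U+00E4 and the element names root and l are assumptions, since the post does not say which character or markup was used.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Assumptions (not from the original post): the repeated character
# and the element names are placeholders chosen for illustration.
my $char     = "\x{E4}";   # one non-ASCII character, 2 bytes in UTF-8
my $lines    = 100;
my $per_line = 32000;
my $file     = 'x.xml';

# Write the file through an encoding layer so it ends up as UTF-8 on disk.
open(my $out, '>:encoding(UTF-8)', $file) or die "cannot write $file: $!";
print {$out} qq{<?xml version="1.0" encoding="UTF-8"?>\n<root>\n};
print {$out} '<l>', $char x $per_line, "</l>\n" for 1 .. $lines;
print {$out} "</root>\n";
close($out) or die "cannot close $file: $!";

# Sanity check: read the file back decoded and count its characters.
open(my $in, '<:encoding(UTF-8)', $file) or die "cannot read $file: $!";
my $xml = do { local $/; <$in> };
close($in);

printf STDERR "%s: %d bytes on disk, %d characters decoded\n",
    $file, -s $file, length $xml;
```

Timing can then be taken from outside the script with the shell's time command and output redirected to /dev/null, as in the runs above.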


Re^5: UTF-8 and XML::Parser
by remiah (Hermit) on Oct 14, 2012 at 06:33 UTC

    parsefile() has some trouble?
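
For comparison, the parsefile() route might look like the sketch below. It lets XML::Parser open and read the file itself instead of taking a filehandle or a slurped string, and puts an encoding layer on STDOUT only. The tiny temporary sample file and the character counter are illustrative assumptions, not part of the original script, and nothing here has been measured against the timings above.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use XML::Parser;
use File::Temp qw(tempfile);

# Illustrative sample input (assumption): a tiny UTF-8 file standing in
# for the large x.xml used in the benchmark.
my ($fh, $file) = tempfile(SUFFIX => '.xml', UNLINK => 1);
binmode $fh, ':encoding(UTF-8)';
print {$fh} qq{<?xml version="1.0" encoding="UTF-8"?>\n},
            qq{<root><l>\x{E4}\x{F6}\x{FC}</l></root>\n};
close($fh) or die "cannot close $file: $!";

# Encode the handler's character strings as UTF-8 on the way out.
binmode STDOUT, ':encoding(UTF-8)';

my $chars = 0;   # running total of characters passed to the Char handler

my $p = XML::Parser->new(ProtocolEncoding => 'UTF-8');
$p->setHandlers(
    Char => sub {
        my ($expat, $text) = @_;
        $chars += length $text;
        print "$text\n";
    },
);

# parsefile() opens the named file and feeds it to the parser.
$p->parsefile($file);

print STDERR "Char handler saw $chars characters\n";
```

To try it against the real test file, replace $file with 'x.xml' and redirect output to /dev/null as before.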
