Re^4: UTF-8 and XML::Parser

by Anonymous Monk
on Oct 14, 2012 at 05:49 UTC

in reply to Re^3: UTF-8 and XML::Parser
in thread UTF-8 and XML::Parser

I benchmarked the binmode variant against the 'use open qw/:std :utf8/;' variant. I made an XML file with 100 lines, each containing 32000 (UTF-8) characters as (P)CDATA. The script at the end of this post did it in 0.20 seconds, while the 'use utf8; / binmode' method takes about 17.5 seconds.
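For anyone who wants to reproduce this, a test file of that shape could be generated roughly as follows. This is only a sketch and is separate from the benchmark script further down: the `write_test_xml` helper, the `<doc>` and `<line>` element names, and the choice of U+00E4 as the repeated character are all assumptions, since the post does not say how the file was built.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: writes $lines elements, each holding $per_line
# copies of one non-ASCII character, encoded as UTF-8 on disk.
sub write_test_xml {
    my ($path, $lines, $per_line) = @_;
    open my $out, '>:encoding(UTF-8)', $path
        or die "Cannot write $path: $!";
    print {$out} qq{<?xml version="1.0" encoding="UTF-8"?>\n<doc>\n};
    # "\x{E4}" (a with diaeresis) is an assumed stand-in for the
    # character used in the original test file.
    my $payload = "\x{E4}" x $per_line;
    print {$out} "<line>$payload</line>\n" for 1 .. $lines;
    print {$out} "</doc>\n";
    close $out or die "Cannot close $path: $!";
    return -s $path;    # file size in bytes
}

# 100 lines of 32000 characters each, as described above.
my $bytes = write_test_xml('x.xml', 100, 32_000);
print "wrote $bytes bytes to x.xml\n";
```

Each repeated character takes two bytes in UTF-8, so the resulting file is a little over 6 MB.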

Unfortunately, perl crashes when I give a filehandle to the parser while using the 'use open qw/:std :utf8/;' method once the file gets big. The 'use utf8; / binmode' method takes about 35 seconds when I pass the filehandle to the parser.

Output was redirected to /dev/null.

#!/usr/bin/perl
use XML::Parser;
#use utf8;
use open qw/:std :utf8/;

$ch = sub {
    my ($p, $w) = @_;
#   binmode STDOUT, ":encoding(UTF-8)";
    print "$w\n";
};

$p = XML::Parser->new(ProtocolEncoding => 'UTF-8');
$p->setHandlers('Char' => $ch);

my $xml = "";
open(F, '< x.xml');
while(<F>) {
    $xml .= $_;
}
$p->parse($xml);
#$p->parse(*F);
close(F);
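One variation that might be worth timing as well: set the output layer once at the top instead of inside the Char handler, and hand the parser raw bytes so that expat does the decoding itself. The sketch below assumes the slow variant called binmode inside the handler, as the commented-out line suggests. Whether that accounts for the timing gap is untested. The `timed_parse` helper is a hypothetical name used only for this illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Time::HiRes qw(time);
use XML::Parser;

# Set the output layer once, up front, instead of on every handler call.
binmode STDOUT, ':encoding(UTF-8)';

# Hypothetical helper: parses a string of raw XML bytes, prints each
# chunk of character data, and returns (characters seen, elapsed seconds).
sub timed_parse {
    my ($xml) = @_;
    my $count  = 0;
    my $parser = XML::Parser->new(ProtocolEncoding => 'UTF-8');
    $parser->setHandlers(Char => sub {
        my ($expat, $text) = @_;
        $count += length $text;
        print "$text\n";
    });
    my $start = time;
    $parser->parse($xml);
    return ($count, time - $start);
}

if (@ARGV) {
    # Read the file as raw bytes; no decoding layer on the input side.
    open my $in, '<:raw', $ARGV[0] or die "Cannot open $ARGV[0]: $!";
    my $xml = do { local $/; <$in> };
    close $in;
    my ($count, $elapsed) = timed_parse($xml);
    printf STDERR "%d characters in %.2f s\n", $count, $elapsed;
}
```

Run it as `perl script.pl x.xml > /dev/null` to match the setup above; the timing line goes to STDERR so it survives the redirect.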

Replies are listed 'Best First'.
Re^5: UTF-8 and XML::Parser
by remiah (Hermit) on Oct 14, 2012 at 06:33 UTC

    Is there some trouble with parsefile()?
