Re^2: JSON::XS and unicode

by remiah (Hermit)
on Sep 08, 2012 at 22:45 UTC

in reply to Re: JSON::XS and unicode
in thread JSON::XS and unicode

There seem to be two problems.

The first is that JSON::XS expects an 'encoded utf8' string by default, as you point out.
The second is that utf8::all doesn't affect File::Slurp's I/O layer. When the OP writes the output to a file and reads it back with File::Slurp's read_file, the result is 'encoded utf8', not 'decoded utf8'. So the second example appears to succeed at a glance.
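To make the bytes-versus-characters difference concrete, here is a minimal sketch that reads the same file twice, once raw and once through a decoding layer. It uses plain open instead of File::Slurp, and the temp file and variable names are my own assumptions for illustration, not the OP's code.

```perl
use strict;
use warnings;
use File::Temp qw( tempfile );
use JSON::XS;

# Write a small JSON document as UTF-8 (hypothetical temp file, for illustration only).
my ($out, $path) = tempfile(UNLINK => 1);
binmode $out, ':encoding(UTF-8)';
print {$out} qq({"yen":"\x{A5}"});
close $out;

# Read with no decoding layer: we get the raw UTF-8 bytes back.
open my $raw, '<:raw', $path or die "open: $!";
my $bytes = do { local $/; <$raw> };
close $raw;

# Read through a decoding layer: we get characters back.
open my $dec, '<:encoding(UTF-8)', $path or die "open: $!";
my $chars = do { local $/; <$dec> };
close $dec;

# Bytes go to utf8(1) (what decode_json does); characters go to utf8(0).
my $data_bytes = JSON::XS->new->utf8(1)->decode($bytes);
my $data_chars = JSON::XS->new->utf8(0)->decode($chars);

printf "raw length: %d, decoded length: %d\n", length $bytes, length $chars;
print $data_bytes->{yen} eq $data_chars->{yen} ? "same value\n" : "different value\n";
```

Feeding $chars to utf8(1), or $bytes to utf8(0), is the mismatch that the two problems above come down to.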

use strict;
use warnings;
use JSON::XS qw( decode_json );
use Data::Dumper;
binmode(STDOUT, ":encoding(UTF-8)");

sub _p { return pack('U', $_[0]) }

my ($wl, $pattern_list);

# create a utf8 decoded (perl internal utf8) JSON string
$wl  = '{"creche": "cr' . _p(0xE8) . 'che",';
$wl .= '"' . _p(0xA5) . '" : "' . _p(0xA3) . '",';
$wl .= '"' . _p(8353) . '": "' . _p(1074) . _p(1086) . _p(1083) . _p(1085) . '"';
$wl .= '}';

# example 1 of OP
sub ex1 {
    my $pattern_list;
    #$pattern_list = decode_json($wl);                     # Wide character in subroutine entry
    #$pattern_list = JSON::XS->new->utf8(1)->decode($wl);  # Wide character in subroutine entry
    # no warning: it seems this module expects encoded utf8, not decoded utf8, by default
    $pattern_list = JSON::XS->new->utf8(0)->decode($wl);
}
#ex1();
#print Dumper $pattern_list;

# example 2
sub ex2 {
    use File::Slurp qw( read_file );
    use utf8::all;
    open my $fh, '>:encoding(UTF-8)', 'test_file2' or die "open: $!";
    print {$fh} $wl;
    close $fh;
    # here utf8::all failed to set File::Slurp's I/O layer
    my $buffer = read_file('test_file2');
    print utf8::is_utf8($buffer) ? "buffer:utf8 flagged\n" : "buffer:not utf8 flagged\n";
    # you get encoded utf8 bytes, and that is the default input for JSON::XS
    $pattern_list = decode_json($buffer);
    # pattern_list is encoded utf8 string, not decoded
    # (note: $pattern_list is a hash reference, so this tests the reference itself,
    #  not the strings stored inside it)
    print utf8::is_utf8($pattern_list) ? "pattern:utf8 flagged\n" : "pattern:not utf8 flagged\n";
}
ex2();
print Dumper $pattern_list;
JSON::XS's utf8 option seems to me very different from the similar-looking options in other modules, such as the DBD:: modules or Template's binmode option.
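As I understand it, the utf8 option in JSON::XS describes the JSON text itself: with utf8(1) the JSON text is UTF-8 encoded bytes, with utf8(0) it is a character string. A rough sketch of both directions, with a sample hash of my own:

```perl
use strict;
use warnings;
use JSON::XS;

my $data = { creche => "cr\x{E8}che" };   # contains one character, U+00E8

# utf8(1): encode returns UTF-8 bytes, ready to write to a raw handle.
my $bytes = JSON::XS->new->utf8(1)->encode($data);

# utf8(0): encode returns a character string; the output layer has to encode it.
my $chars = JSON::XS->new->utf8(0)->encode($data);

printf "bytes: %d, chars: %d\n", length $bytes, length $chars;

# Each form should be decoded with the matching setting.
my $back_bytes = JSON::XS->new->utf8(1)->decode($bytes);
my $back_chars = JSON::XS->new->utf8(0)->decode($chars);
print $back_bytes->{creche} eq $back_chars->{creche} ? "same\n" : "different\n";
```

So the setting is about what form the JSON text takes on the way in and out, and it should be chosen to match whatever the file handle or other I/O code is already doing.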

