http://www.perlmonks.org?node_id=979436


in reply to UTF8 with YAML or JSON

I'm mostly just repeating clues already given:

my @in  = <DATA>;
my $in  = join("",@in);
my $dat = Load($in);

print $dat->{x}, $/;
print length($dat->{x}), $/;

__DATA__
x: ă
moo@cow~>perl -MYAML::Syck pm-979143
ă
2
moo@cow~>perl -MYAML::XS pm-979143
Wide character in print at pm-979143 line 7, <DATA> line 1.
ă
1
moo@cow~>perl -CO -MYAML::XS pm-979143
ă
1

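You can see that only the YAML::XS version is doing UTF-8: it decodes the input into characters, so length reports 1, while YAML::Syck hands back the raw bytes, so length reports 2. The "Wide character" warning only means STDOUT has no encoding layer, and -CO adds one. YAML::Syck's documentation listed it as deprecated until somewhat recently, when it picked up a new maintainer. JSON(::XS) is also a fine, maybe better, choice. Neither lets you off the hook for knowing whether bytes or characters are in play.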
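For what it's worth, here is a minimal sketch of the bytes-versus-characters distinction on its own, outside of any YAML module. It is not from the original thread: the variable names are made up for illustration, and it uses the core modules Encode and JSON::PP (the latter as a stand-in for JSON::XS, which offers the same decode_json interface). The input is written with hex escapes so the script does not depend on how the source file is saved.

```perl
use strict;
use warnings;
use Encode qw(decode encode);
use JSON::PP qw(decode_json);

# The two UTF-8 bytes that encode U+0103 (a with breve), given as
# escapes so this file stays pure ASCII.
my $bytes = "\xC4\x83";

# Decoding turns the byte string into a one-character string.
my $chars = decode('UTF-8', $bytes);

print length($bytes), "\n";   # 2 (bytes)
print length($chars), "\n";   # 1 (character)

# decode_json expects UTF-8 encoded bytes and returns decoded characters,
# which is the same behaviour YAML::XS showed above.
my $json_bytes = qq({"x":"$bytes"});
my $from_json  = decode_json($json_bytes)->{x};
print length($from_json), "\n";   # 1

# Give STDOUT an encoding layer (the in-script equivalent of -CO)
# so printing the character string does not warn about wide characters.
binmode(STDOUT, ':encoding(UTF-8)');
print $chars, "\n";
```

The rule of thumb this illustrates: decode on the way in, work with characters, and encode on the way out, either explicitly with encode or through an I/O layer as done here.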