PerlMonks  

Data Dumper utf8 utf-8 unicode

by smartnut007 (Initiate)
on Apr 23, 2009 at 00:24 UTC (#759457)
smartnut007 has asked for the wisdom of the Perl Monks concerning the following question:

use utf8;
use Data::Dumper;
use Encode;

open(my $f ,">:encoding(utf-8)", "C:/tmp/utf8.txt");
binmode $f,':encoding(utf-8)';
print $f "Log Started \n ";
END{ close($f); };

my $str = '原來';
print $f $str."\n";
print $f Dumper( $str );
print "\n";
print $f decode_utf8( Dumper( encode_utf8( $str ) ) );

The output is

Log Started
原 來
$VAR1 = [ "\x{539f}\x{4f86}" ];
$VAR1 = '原來';

My question is: how do I get Data::Dumper to handle utf8 strings and not produce ISO-latin-1 encoded utf-8 characters like \x{....}? PS: forgive my formatting. Can't get perlmonks.org to handle utf-8 very well either :-)

Re: Data Dumper utf8 utf-8 unicode
by almut (Canon) on Apr 23, 2009 at 01:09 UTC

    I think that behavior is hardcoded/not configurable, but you could try some ugly hack like this (i.e. redefine the qquote routine):

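    use Data::Dumper;
    $Data::Dumper::Useqq = 1;
    # Force the pure-Perl dumper; newer Data::Dumper releases may handle
    # Useqq in XS and never call qquote.
    $Data::Dumper::Useperl = 1;

    {
        no warnings 'redefine';
        # Replace the internal quoting routine: emit the string unescaped,
        # wrapped in single quotes.
        sub Data::Dumper::qquote {
            my $s = shift;
            return "'$s'";
        }
    }

    my $str = "abc \x{539f}\x{4f86} xyz";
    binmode STDOUT, ':encoding(UTF-8)';
    print Dumper $str;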

    Output:
    $VAR1 = 'abc 原來 xyz';

    (untested for side-effects!)

    And if that's for more than just dumping stuff to look at, in other words, if you want to be able to eval the output as would normally be possible, you'd at least have to take care of properly quoting any single quotes and backslashes within those single quoted strings...
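The quoting caveat above could be handled roughly as follows. This is only a sketch of the same hack, not a supported Data::Dumper interface: it escapes backslashes and single quotes so that the dumped literal can be eval'ed again. The sample string, the use of `$Data::Dumper::Useperl` and `$Data::Dumper::Terse`, and the `unicode_eval` feature (Perl 5.16 or later) are assumptions added for illustration.

```perl
use strict;
use warnings;
use utf8;
use feature 'unicode_eval';    # assumption: Perl 5.16+; eval treats the string as characters
use Data::Dumper;

$Data::Dumper::Useperl = 1;    # assumption: force the pure-Perl path so qquote is called
$Data::Dumper::Useqq   = 1;
$Data::Dumper::Terse   = 1;    # dump the bare literal, without the '$VAR1 =' prefix

{
    no warnings 'redefine';
    # Same hack as above, but escape the two characters that are
    # special inside a single-quoted Perl string.
    sub Data::Dumper::qquote {
        my $s = shift;
        $s =~ s/([\\'])/\\$1/g;
        return "'$s'";
    }
}

my $str    = "it's 原來 \\ ok";     # contains a single quote and a backslash
my $dumped = Dumper($str);
my $copy   = eval $dumped;          # read the literal back in

binmode STDOUT, ':encoding(UTF-8)';
print $dumped;
print defined $copy && $copy eq $str
    ? "round trip ok\n"
    : "round trip failed\n";
```

Control characters are still written out raw here, so this stays a viewing aid rather than a general serializer.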

Re: Data Dumper utf8 utf-8 unicode
by Juerd (Abbot) on Jun 21, 2009 at 23:35 UTC

    My question is: how do I get Data::Dumper to handle utf8 strings and not produce ISO-latin-1 encoded utf-8 characters like \x{....}?

    It can't do that; Data::Dumper creates those \x{...} escapes unconditionally. You could instead hack Data::Dumper as already suggested by almut, or look for a replacement for Data::Dumper. As there are several, I'm convinced that at least one of them will do what you want.
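If the goal is only to read the dump, one alternative to patching Data::Dumper is to leave it alone and turn the `\x{...}` escapes back into characters afterwards. The sketch below assumes the output is for display only (it is not meant to be eval'ed); `readable_dump` is a hypothetical helper name, not part of Data::Dumper.

```perl
use strict;
use warnings;
use Data::Dumper;

# Hypothetical helper: dump as usual, then replace each \x{HEX} escape
# with the character it stands for. The leading part of the pattern
# skips escapes whose backslash is itself escaped (a literal "\x{...}"
# in the data shows up as "\\x{...}" in the dump).
sub readable_dump {
    my $out = Dumper(@_);
    $out =~ s/(?<!\\)((?:\\\\)*)\\x\{([0-9a-fA-F]+)\}/$1 . chr(hex($2))/ge;
    return $out;
}

my $str = "abc \x{539f}\x{4f86} xyz";

binmode STDOUT, ':encoding(UTF-8)';
print readable_dump($str);    # $VAR1 = "abc 原來 xyz";
```

Because the replacement runs on the finished text, it works the same way for nested structures and hash keys.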

Re: Data Dumper utf8 utf-8 unicode
by ikegami (Pope) on Jun 22, 2009 at 04:17 UTC

    Can't get perlmonks.org to handle utf-8 very well either :-)

    Not at all, actually. PerlMonks pages are encoded using iso-latin-1. As such, all data submitted must be iso-latin-1 encoded. Characters that aren't in iso-latin-1 can't be submitted.

    As a form of wishful thinking, your browser submits HTML/XML escapes for characters that can't be submitted. This doesn't work well when HTML isn't expected (inside of <code></code> tags).
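To illustrate what ends up in the submitted text, Encode's `FB_HTMLCREF` fallback produces the same kind of numeric character references when a string is encoded to iso-8859-1. This sketch only mimics the effect; it is not what the browser or PerlMonks actually runs, and the sample string is taken from the question.

```perl
use strict;
use warnings;
use Encode qw(encode);

my $str  = "\x{539f}\x{4f86}";   # the two characters from the question
my $copy = $str;                 # encode() with a CHECK value may modify its input

# Characters that have no iso-8859-1 representation are replaced
# by HTML numeric character references.
my $escaped = encode('iso-8859-1', $copy, Encode::FB_HTMLCREF());

print "$escaped\n";              # &#21407;&#20358;
```

Inside code tags those references are shown literally, which is why the question's code and output came through garbled.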
