http://www.perlmonks.org?node_id=11104502



Take the following code, replace the my $data line with your own code that fetches the data from the database (keep it as short as possible), and post both your code and its output here. <update> Note that PerlMonks does not handle Unicode inside of <code> tags well, so tell us if you've got any Unicode in there. </update>

#!/usr/bin/env perl
use warnings;
use 5.012;
use utf8; # Perl script file is encoded as UTF-8
use open qw/:utf8 :std/; # reopen STDIN/OUT/ERR as UTF-8
use Text::CSV;
use CGI qw/escapeHTML/;
use CGI::Carp qw/fatalsToBrowser warningsToBrowser/; # for debug ONLY!
use Data::Dumper;
$Data::Dumper::Useqq=1;

my $cgi = CGI->new;
print $cgi->header(-charset=>'UTF-8');
print $cgi->start_html(-title=>'Example', -encoding=>'UTF-8');
warningsToBrowser(1);

my $data = "Euro symbol: \N{U+20AC} | I \N{U+2764}\N{U+FE0F} \N{U+1F42A}";

print $cgi->pre(escapeHTML( Dumper( $data ) # debugging
	."UTF-8 flag is ".( utf8::is_utf8( $data )?'on':'off' ) ));

print $cgi->p(escapeHTML( $data ));

my $csv = Text::CSV->new ({ binary => 1, sep_char => "|" });
$csv->parse($data);
my ($c1,$c2) = $csv->fields;
print $cgi->p(escapeHTML( "After Text::CSV: ".$c1." | ".$c2 ));

print $cgi->end_html;

(Note: utf8::is_utf8() is best used only for debugging.) The above should produce the following output in the browser:

$VAR1 = "Euro symbol: \x{20ac} | I \x{2764}\x{fe0f} \x{1f42a}";
UTF-8 flag is on

Euro symbol: € | I ❤️ 🐪

After Text::CSV: Euro symbol: € | I ❤️ 🐪
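
If the Dumper line for your database value looks different from the one above, one possible cause is that the value is still undecoded UTF-8 bytes rather than Perl characters. The following is only a rough sketch of that difference, using core modules: $bytes is a hypothetical stand-in for whatever your database returns, and I'm assuming the data is UTF-8, which may not match your setup.

```perl
#!/usr/bin/env perl
use warnings;
use strict;
use Encode qw/decode encode/;
use Data::Dumper;
$Data::Dumper::Useqq = 1;   # show non-ASCII as escapes
$Data::Dumper::Terse = 1;   # print the value only, without "$VAR1 = "

# Hypothetical stand-in for a database value that arrives as raw UTF-8 bytes
my $bytes = encode('UTF-8', "Euro symbol: \N{U+20AC}");

# Decoding turns those bytes into Perl characters
my $chars = decode('UTF-8', $bytes);

print "bytes: ", Dumper($bytes);   # one escape per byte of the Euro sign
print "chars: ", Dumper($chars);   # a single escape for the Euro sign

# prints "lengths: 16 bytes vs. 14 characters"
printf "lengths: %d bytes vs. %d characters\n", length($bytes), length($chars);
```

If your value dumps like the "bytes" line, it most likely needs decoding (or the database driver needs to be told to do it) before the string is handed to Text::CSV and escapeHTML.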