Re^3: Why Doesn't Text::CSV_XS Print Valid UTF-8 Text When Used With the open Pragma? ("XS")

by remiah (Hermit)
on Oct 02, 2011 at 07:49 UTC (#929118)


in reply to Re^2: Why Doesn't Text::CSV_XS Print Valid UTF-8 Text When Used With the open Pragma? ("XS")
in thread Why Doesn't Text::CSV_XS Print Valid UTF-8 Text When Used With the open Pragma?

Your unexpected output looks like the SPADE characters displayed as ISO-8859-1. If you write the output to a text file and view it in your browser with UTF-8 encoding, you will probably see the SPADE.

use Encode qw(encode);

print qq("BLACK SPADE SUIT","BLACK HEART SUIT","BLACK DIAMOND SUIT","BLACK CLUB SUIT",\n);
# decimal Unicode character references for the names above
my @ary = ("&#9824;", "&#9829;", "&#9830;", "&#9827;");
foreach my $target (@ary) {
    # strip the &#...; wrapper, leaving the decimal code point
    $target =~ s/\&#(.*);/$1/;
    print '"' . encode('utf8', chr($target)) . '",';
}
print "\n";
I mean, this is a terminal problem, isn't it?
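
To make the display hypothesis above concrete, here is a minimal sketch using only the core Encode module. It is an illustration of what a UTF-8 encoded BLACK SPADE SUIT looks like when its bytes are misread as ISO-8859-1, not a diagnosis of the original poster's script; the variable names are made up for the example.

```perl
use strict;
use warnings;
use Encode qw(encode decode);

# U+2660 BLACK SPADE SUIT as a Perl character string
my $spade = chr(9824);

# Encoding to UTF-8 produces three bytes
my $bytes = encode('UTF-8', $spade);
my $hex   = join ' ', map { sprintf '%02X', ord } split //, $bytes;

# Misreading those bytes as ISO-8859-1 yields three characters instead of one
my $misread     = decode('ISO-8859-1', $bytes);
my $misread_len = length $misread;

# Reading them back as UTF-8 restores the single original character
my $roundtrip = decode('UTF-8', $bytes);

print "UTF-8 bytes: $hex\n";
print "characters when read as ISO-8859-1: $misread_len\n";
print "round trip ok: ", ($roundtrip eq $spade ? "yes" : "no"), "\n";
```

If the bytes in the output file match the UTF-8 encoding, the file itself is fine and only the viewer is decoding it wrongly; if they do not, the problem is upstream of the terminal.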


Re^4: Why Doesn't Text::CSV_XS Print Valid UTF-8 Text When Used With the open Pragma? ("XS")
by Anonymous Monk on Oct 02, 2011 at 08:33 UTC

    I mean, this is a terminal problem, isn't it?

    No. It looks similar, but no. The problem in a nutshell: if you use warn "$ARGV $_ " for PerlIO::get_layers(*ARGV), you can see that ARGV doesn't get the utf8 IO layers; only STDIN gets them.

    $ perl ... utf8wobom.csv >bad
    utf8wobom.csv unix at ...
    utf8wobom.csv crlf at ...
    $ perl ... < utf8wobom.csv >good
    - unix at ...
    - crlf at ...
    - encoding(utf-8-strict) at ...
    - utf8 at ...

    In my non-utf terminal it shows

    $ ls -loanh good bad
    -rw-rw-rw- 1 0 115 2011-10-02 01:31 bad
    -rw-rw-rw- 1 0 103 2011-10-02 01:31 good
    $ diff good bad
    2c2
    < "Γ","Γ","Γ֪","Γ"
    ---
    > "├┬┬","├┬┬","├┬┬","├┬┬"
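
The layer check described in this reply can be sketched on its own. The snippet below is an assumption-laden illustration, not the script that produced the output above: it uses two in-memory handles (hypothetical names $raw and $dec) instead of ARGV and STDIN, and a made-up helper report_layers, simply to show how PerlIO::get_layers reveals whether a decoding layer is present.

```perl
use strict;
use warnings;

# Hypothetical helper: warn once per PerlIO layer found on a handle
sub report_layers {
    my ($name, $fh) = @_;
    my @layers = PerlIO::get_layers($fh);
    warn "$name $_\n" for @layers;
    return @layers;
}

my $data = "sample\n";

# One handle opened without any encoding layer, one with :encoding(UTF-8)
open my $raw, '<', \$data or die "cannot open raw handle: $!";
open my $dec, '<:encoding(UTF-8)', \$data or die "cannot open decoding handle: $!";

my @raw_layers = report_layers('raw', $raw);
my @dec_layers = report_layers('dec', $dec);
```

A handle that is missing the encoding and utf8 entries in this listing is returning undecoded bytes, which is the situation reported for ARGV above.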
      Probably, Text::CSV::Encoded is what you are looking for.
      use Text::CSV::Encoded;
      my $csv = Text::CSV::Encoded->new ({binary=>1, encoding=>"utf8"}) or die $!;
      while (my $row = $csv->getline (*ARGV)) {
          $csv->print(\*STDOUT, $row);
      }
      This works fine with my perl 5.12.2, with the command line:
      perl test.pl test.csv

        Text::CSV::Encoded is indeed a nice extension to the Text::CSV parser family, but it does not solve the underlying problem as stated in the original question. use encoding ... is (or now was) broken in combination with XS code. Most bugs that stem from that can be worked around by using other encode/decode approaches, but it is (or was) still a bug.


        Enjoy, Have FUN! H.Merijn
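
For readers wondering what "other encode/decode approaches" might look like in general, here is one possible sketch using only the core Encode module. It is not taken from the post above and is not necessarily the workaround H.Merijn had in mind; the sample data and variable names are hypothetical, and in-memory handles stand in for real files. The idea is to keep the handles raw and convert between bytes and characters explicitly, so nothing depends on a pragma reaching the handle.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Hypothetical sample input: one CSV line holding the UTF-8 bytes of U+2660
my $input_bytes = qq("\xE2\x99\xA0","spade"\n);

# Raw handles: no implicit decoding or encoding takes place
open my $in, '<:raw', \$input_bytes or die "cannot open input: $!";
my $output_bytes = '';
open my $out, '>:raw', \$output_bytes or die "cannot open output: $!";

my $char_count = 0;
while (my $line = <$in>) {
    my $text = decode('UTF-8', $line);      # bytes -> characters
    $char_count += length $text;            # counts characters, not bytes
    print {$out} encode('UTF-8', $text);    # characters -> bytes
}
close $out or die "cannot close output: $!";
close $in;

print "characters read: $char_count\n";
```

The cost of this style is that every read and write has to be converted by hand, which is exactly the bookkeeping that IO layers are meant to remove.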
