PerlMonks  

Text::CSV encoding parse()

by slugger415 (Monk)
on Aug 13, 2019 at 18:00 UTC (#11104403=perlquestion)

slugger415 has asked for the wisdom of the Perl Monks concerning the following question:

Hello esteemed monks, I am using Text::CSV to parse an array of text strings (pipe delimited) and want to use UTF-8 encoding to read the strings. In the doc at https://metacpan.org/pod/Text::CSV#new I see this instruction:

On parsing (both for "getline" and "parse"), if the source is marked being UTF8, then all fields that are marked binary will also be marked UTF8.

I have set my 'new' instance to binary, and it mostly works, except some accented characters are showing up in the resulting web page as black diamond question marks, e.g. conexi�n. (Japanese and other language characters look fine.) Is there something else I need to set? If I don't use Text::CSV and just 'split' the strings, those characters look fine, and correct.

my $csv = Text::CSV->new ({ binary => 1, sep_char => "|" });
foreach my $row (@sorted_urls){
    $csv->parse($row);
    # processing
}

Thank you.

Replies are listed 'Best First'.
Re: Text::CSV encoding parse()
by haukex (Chancellor) on Aug 13, 2019 at 18:05 UTC
    some accented characters are showing up in the resulting web page as black diamond question marks

    Are you sure you've also set your output filehandles to the correct encoding, and have specified that encoding in the HTML? Please provide a Short, Self-Contained, Correct Example.
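    One common cause of this exact pattern (Latin-1-range characters such as ó turning into � while Japanese text looks fine) is printing decoded character strings to a filehandle that has no encoding layer. What follows is only a sketch under that assumption, not a diagnosis of the script in question. bytes_written() and hex_dump() are hypothetical helpers; an in-memory filehandle stands in for STDOUT so the emitted bytes can be inspected.

```perl
use strict;
use warnings;

# Hypothetical helper: print $text to an in-memory filehandle opened with
# the given PerlIO layer ('' means no layer) and return the raw bytes.
sub bytes_written {
    my ($text, $layer) = @_;
    my $buffer = '';
    open my $out, ">$layer", \$buffer
        or die "cannot open in-memory handle: $!";
    print {$out} $text;
    close $out or die "cannot close in-memory handle: $!";
    return $buffer;
}

# Hypothetical helper: show each byte as two hex digits.
sub hex_dump {
    my ($bytes) = @_;
    return join ' ', map { sprintf '%02X', ord } split //, $bytes;
}

my $text = "conexi\x{F3}n";    # character string containing U+00F3 (ó)

# With a UTF-8 layer, ó becomes the two bytes C3 B3 (valid UTF-8).
print hex_dump(bytes_written($text, ':encoding(UTF-8)')), "\n";
# prints: 63 6F 6E 65 78 69 C3 B3 6E

# Without a layer, ó goes out as the single byte F3, which is not valid
# UTF-8 on its own, so a browser told "utf-8" would render it as U+FFFD.
print hex_dump(bytes_written($text, '')), "\n";
# prints: 63 6F 6E 65 78 69 F3 6E
```

    In a CGI script the equivalent would be binmode STDOUT, ':encoding(UTF-8)'; before the first print, assuming the strings being printed are decoded character strings and not already-encoded bytes.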

    To debug the input end of the process, see my suggestions at Re: Parsing Problems.
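    For readers without the linked node at hand, one generic way to inspect what a string really contains is sketched below. It is not necessarily what that node recommends. describe() is a hypothetical helper built on the core modules Data::Dumper and Encode; it shows the escaped content, the length, and the internal UTF8 flag of a string.

```perl
use strict;
use warnings;
use Data::Dumper;
use Encode qw(decode);

$Data::Dumper::Useqq  = 1;    # escape non-ASCII so nothing depends on the terminal
$Data::Dumper::Terse  = 1;    # no "$VAR1 =" prefix
$Data::Dumper::Indent = 0;    # keep the dump on one line

# Hypothetical helper: summarize a string for debugging.
sub describe {
    my ($str) = @_;
    return sprintf '%s length=%d utf8_flag=%s',
        Dumper($str), length($str), utf8::is_utf8($str) ? 'on' : 'off';
}

my $octets = "conexi\xC3\xB3n";            # undecoded UTF-8 bytes, as read from a raw handle
my $chars  = decode('UTF-8', $octets);     # decoded character string

print describe($octets), "\n";    # 9 elements: ó is still two separate bytes
print describe($chars),  "\n";    # 8 elements: ó is one character
```

    In the script from the question, calling describe($row) before $csv->parse($row) and on each element of $csv->fields afterwards would show whether the rows arrive as bytes or as characters, and whether parsing changes that.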

      Hi, yes I'm using the CGI module and have it properly set:

      print $q->header(-charset    => 'utf-8');

      And as mentioned, if I don't use Text::CSV the characters display correctly.

        Hi, yes I'm using the CGI module and have it properly set: print $q->header(-charset => 'utf-8'); And as mentioned, if I don't use Text::CSV the characters display correctly.

        Ok, but I'm sorry, there still isn't enough information to answer your question - have another look at my reply above, plus the links therein.

        That means that you are declaring to the browser that your output is UTF-8. Is it actually UTF-8?
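        A rough way to answer that question is to capture the bytes the script emits and try a strict decode. This is a sketch only; is_valid_utf8() is a hypothetical helper, and it assumes the captured output is held as a byte string.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Hypothetical helper: return 1 if $bytes decodes cleanly as UTF-8, else 0.
# FB_CROAK makes decode() die on malformed input; LEAVE_SRC stops it from
# modifying the caller's string.
sub is_valid_utf8 {
    my ($bytes) = @_;
    my $ok = eval {
        decode('UTF-8', $bytes, Encode::FB_CROAK() | Encode::LEAVE_SRC());
        1;
    };
    return $ok ? 1 : 0;
}

print is_valid_utf8("conexi\xC3\xB3n") ? "valid" : "invalid", "\n";   # ó as C3 B3
print is_valid_utf8("conexi\xF3n")     ? "valid" : "invalid", "\n";   # ó as a lone F3 byte
```

        If the page body fails this check while the header says charset=utf-8, the browser is being told one encoding and sent another, which is when the � replacement character appears.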
