PerlMonks  

Re: Processing spreadsheet with some cells in ASCII, other cells in UTF-8

by stevieb (Canon)
on Sep 01, 2015 at 20:57 UTC ( [id://1140716]=note )


in reply to Processing spreadsheet with some cells in ASCII, other cells in UTF-8

This is my first ever attempt at non-ASCII processing, so I'll let the more experienced Monks say whether this is the wrong approach or whether there's a better one.

After a very quick dig online, I found that setting binmode on all the file handles can fix the issue:

Input file:

$ cat in.txt
}cýæu}*]…‘¦å
hello ›ÇÁ

Code:

use warnings;
use strict;

open my $fh, '<', 'in.txt' or die $!;
open my $wfh, '>', 'out.txt' or die $!;

binmode $fh, ":utf8";
binmode $wfh, ":utf8";
binmode STDOUT, ":utf8";

while (<$fh>){
    chomp;
    print $wfh "file: $_\n";
    print "stdout: $_\n";
}

Output:

# output file
file: }cýæu}*]…‘¦å
file: hello ›ÇÁ

# stdout
stdout: }cýæu}*]…‘¦å
stdout: hello ›ÇÁ
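
For comparison, here is a minimal sketch of the same idea with the layers attached in open() itself rather than through separate binmode calls. It uses `:encoding(UTF-8)`, which as far as I know validates the input, whereas plain `:utf8` does not check it. The temp directory and the sample text are my own assumptions so the snippet runs on its own; they are not the data from the original question.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Hypothetical scratch files, so the sketch does not depend on in.txt existing.
my $dir = tempdir(CLEANUP => 1);
my ($in, $out) = ("$dir/in.txt", "$dir/out.txt");

# Write a small UTF-8 sample (assumed content, not the original poster's data).
open my $seed, '>:encoding(UTF-8)', $in or die "Cannot write $in: $!";
print {$seed} "caf\x{e9}\n", "hello \x{263A}\n";
close $seed or die "Cannot close $in: $!";

# Attach the layers in open() instead of calling binmode afterwards.
open my $fh,  '<:encoding(UTF-8)', $in  or die "Cannot read $in: $!";
open my $wfh, '>:encoding(UTF-8)', $out or die "Cannot write $out: $!";
binmode STDOUT, ':encoding(UTF-8)';

my @lines;
while (my $line = <$fh>) {
    chomp $line;
    push @lines, $line;              # decoded characters, not raw bytes
    print {$wfh} "file: $line\n";
    print "stdout: $line\n";
}
close $fh;
close $wfh or die "Cannot close $out: $!";
```

The lexical `$line` and the `print {$fh} ...` form are only style choices; the behaviour should match the binmode version for well-formed UTF-8 input.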

Re^2: Processing spreadsheet with some cells in ASCII, other cells in UTF-8
by Amphiaraus (Beadle) on Sep 02, 2015 at 19:31 UTC

    The problem in our team's Perl script was fixed simply by adding a one-line change at the top of the script:

    binmode STDOUT, ':utf8'; #This handles the multiple encoding for language menus in the Perl IO layers.

    The above one-line change fixed the problem: the Perl script can now process the contents of Excel spreadsheet cells, no matter what type of character encoding was used in them.

    No additional low-level calls to decode or encode were needed.

    This change was taken from one of the replies to my original question. Thanks for your help.

    The various web pages found in the replies, which discussed character encoding, were also very helpful.
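
To illustrate what the `:utf8` layer on an output handle changes, here is a small sketch that prints the same one-character string to two handles and dumps the bytes that come out. In-memory handles stand in for STDOUT so the bytes can be inspected; that substitution and the sample character are my assumptions, not part of the script discussed above. It only shows the output side and says nothing about how the spreadsheet cells get decoded on input.

```perl
use strict;
use warnings;

# One decoded character: U+00E9 (e with acute accent).
my $text = "\x{e9}";

# Handle without an encoding layer: the character goes out as a single byte.
open my $raw, '>', \my $raw_bytes or die "Cannot open in-memory handle: $!";
print {$raw} $text;
close $raw;

# Handle with the same :utf8 layer the reply applies to STDOUT.
open my $enc, '>:utf8', \my $utf8_bytes or die "Cannot open in-memory handle: $!";
print {$enc} $text;
close $enc;

# Hex-dump both buffers for comparison.
printf "no layer: %s\n", join ' ', map { sprintf '%02X', ord } split //, $raw_bytes;
printf ":utf8   : %s\n", join ' ', map { sprintf '%02X', ord } split //, $utf8_bytes;
```

On my reading this prints `E9` for the plain handle and `C3 A9` for the `:utf8` one, which is why a UTF-8 terminal shows the character correctly only once the layer is in place.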
