How to handle encoding for STDERR

na has asked for the wisdom of the Perl Monks concerning the following question:

I have code like this:

#!/usr/bin/perl
use warnings;
use strict;
use utf8;

binmode( STDERR, ':utf8' );
$ENV{ LANG } = 'C';
warn "<UTF-8 string>";
open( my $in, '<', 'non-existing-file' ) || die $!;

Because "$!" is just a byte string, output is terrible. My questions are:

1) Is there any way to force '$!' into the 'C' locale? "$ENV{ LANG } = 'C';" doesn't work.

2) Is there a good solution? Some of my poor workarounds are:

Don't use 'use utf8'.
Don't use "binmode( STDERR, ':utf8' );" and live with "Wide character in warn" messages.
Decode '$!', and don't use any modules that call 'die $!'.
Use __DIE__/__WARN__ signal handlers.

My perl version: v5.14.2


Replies are listed 'Best First'.
Re: How to handle encoding for STDERR by andal (Hermit) on Feb 11, 2013 at 08:01 UTC
Well, your question is very hard to understand. What exactly is your problem? Read perllocale. It states: "By default, Perl ignores the current locale. The "use locale" pragma tells Perl to use the current locale for some operations." So, unless you say "use locale", your locale settings are ignored by the Perl program. But they are not ignored by the shell that was used to execute the Perl program. So, if the shell is configured to receive UTF-8 text from programs, then your Perl program should produce it; otherwise you get garbage.

Now, you get garbage. First, you should figure out what the source of the garbage is. You have the code 'warn "<UTF-8 string>"'. Do I assume correctly that in place of "<UTF-8 string>" you have some text with UTF-8 characters? Does this string show up correctly? If only $! shows up as garbage, have you checked whether it contains octets or has the utf8 flag set?

As far as I understand it, binmode configures the filehandle to convert all data from the internal encoding (marked by the presence of the utf8 flag) to a sequence of octets in the appropriate encoding. So, if the data is already a sequence of octets, the additional conversion will mess it up.

Personally, I avoid using binmode to set up UTF-8 handling. I just follow a simple rule: output only octets in the appropriate encoding (normally UTF-8). Then I use Encode::decode or Encode::encode to convert octets to strings as Perl understands them, or back from Perl strings to octets for output. If there's 'use utf8', then any strings written directly in the script will be converted to Perl's internal format, so those have to be converted to octets before they are passed outside of the Perl program.

I've never seen $! contain non-English text, because most of the systems I work with don't have anything but English installed, so I don't know what form the text takes there. But if it is just a sequence of octets, then you'll have a problem outputting it through a filehandle that expects a Perl string and not a sequence of octets.
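A minimal sketch of the decode-before-output idea above, assuming $! arrives as octets in the locale's encoding and that the locale is UTF-8 (as with ja_JP.UTF-8). The helper name os_error_text is hypothetical and not part of any module; whether $! is octets or already a character string depends on the Perl version, so the helper checks first.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Hypothetical helper: turn an OS error message into a character string.
sub os_error_text {
    my ($msg, $encoding) = @_;
    $encoding //= 'UTF-8';                # assumption: a UTF-8 locale
    return $msg if utf8::is_utf8($msg);   # already characters; leave as-is
    return decode($encoding, $msg);       # octets -> characters
}

# STDERR now expects character strings and encodes them on output.
binmode(STDERR, ':encoding(UTF-8)');

open(my $in, '<', 'non-existing-file')
    or warn "Can't open non-existing-file: ", os_error_text("$!"), "\n";
```

This only helps for messages you build yourself; a module that calls 'die $!' internally still hands raw octets to the encoding layer.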
Re^2: How to handle encoding for STDERR by na (Novice) on Feb 12, 2013 at 13:52 UTC
Sorry for the poor writing. Because a correctly encoded string in my locale may look like 'garbage' to most of you, I avoided pasting the exact script and output. As you guessed, "<UTF-8 string>" is a UTF-8 encoded string in Japanese, and it is output fine (because of the combination of "use utf8" and "binmode(STDERR, ':utf8')"). It may depend on the OS and locale, but with the "ja_JP.UTF-8" locale on Ubuntu, Perl generates language-specific error messages. Half of my question was answered by Anonymous Monk. I just want to know how to get the error string in the C locale even if the Perl process starts in a non-C locale.
Re: How to handle encoding for STDERR by Anonymous Monk on Feb 11, 2013 at 05:05 UTC
Because "$!" is just a byte string, output is terrible.
What does that mean? I imagine you would need to use POSIX and/or locale for $ENV{LANG}='C' to have an effect at runtime.
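A rough sketch of that suggestion, assuming a platform where POSIX provides LC_MESSAGES (Linux/glibc does). Assigning to $ENV{LANG} inside the script comes too late because the process locale was already set up at startup; calling setlocale changes it at runtime. The helper name c_locale_strerror is hypothetical. I believe newer Perl releases already give English $! text outside "use locale", so this matters mostly for older versions such as 5.14.

```perl
use strict;
use warnings;
use POSIX qw(setlocale LC_MESSAGES);

# Hypothetical helper: return the C-locale message text for an errno value.
sub c_locale_strerror {
    my ($errno) = @_;
    # Switch only the message category; note this affects the whole process.
    setlocale(LC_MESSAGES, 'C');
    local $! = $errno;   # numeric assignment sets errno
    return "$!";         # string context yields the message text
}

# Capture errno numerically before doing anything else that might reset it.
open(my $in, '<', 'non-existing-file')
    or warn "Can't open non-existing-file: ", c_locale_strerror($! + 0), "\n";
```

Calling setlocale(LC_MESSAGES, 'C') once near the top of the script would also cover modules that call 'die $!' themselves, at the cost of losing localized messages everywhere.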
Re^2: How to handle encoding for STDERR by Anonymous Monk on Feb 11, 2013 at 05:10 UTC
https://metacpan.org/search?q=locale%20errno
perllocale
POSIX::Wide -- POSIX functions returning wide-char strings
Re^3: How to handle encoding for STDERR by na (Novice) on Feb 12, 2013 at 13:08 UTC
Thank you for the information. I should have searched by myself first!
Re^4: How to handle encoding for STDERR by Anonymous Monk on Feb 12, 2013 at 13:21 UTC
Re: How to handle encoding for STDERR by ww (Archbishop) on Feb 11, 2013 at 13:51 UTC
Whatever other problems may exist, your open (Ln 9) comes up a little short of a dozen; a few degrees off plumb; or, more directly, wrong. Your

open( my $in, '<', 'non-existing-file' ) || die $!;

should be

open( my $in, '<', 'path-to-file-name' ) or die "Can't open path-to-file-name, $!";

or, for a write (with a [now deprecated, at least by some] bareword and variant quote symbols):

my $datafile; open (OUT, ">", $datafile ) or die "Can't open $datafile for write, $!";
Re: How to handle encoding for STDERR by Anonymous Monk on Feb 11, 2013 at 13:07 UTC
I am suspicious of a bug, or possibly of bad data from wherever this stuff is coming from. Don't try to "code around it"; find the true root cause of what's going on here. Red flags abound, demanding explanation.
