|Perl: the Markov chain saw|
Re: How to handle encoding for STDERRby andal (Hermit)
|on Feb 11, 2013 at 08:01 UTC||Need Help??|
Well, your question is very hard to understand. What exactly is your problem? Read perllocale. It states
By default, Perl ignores the current locale. The "use locale" pragma tells Perl to use the current locale for some operations
So, unless you say "use locale" your locale settings are ignored by perl program. But they are not ignored by the shell that was used to execute perl program. So, if the shell is configured to receive UTF-8 text from programs, then your perl program should produce it, otherwise you get garbage to see.
Now, you get garbage. First, you should figure out, what is the source for the garbage. You have code 'warn "<UTF-8 string>"', do I assume correctly, that in place of "<UTF-8 string>" you do have some text with UTF-8 characters? Is this string shows up correctly?
If only $! shows up as garbage, have you tried to check if it contains octets or has utf8 flag set? As far as I understand it, binmode configures filehandle to convert all data from internal encoding (marked by presence of utf8 flag) to the sequence of octets in appropriate encoding. So, if the data is already sequence of octets, then the additional conversion will mess up the data.
Personally, I avoid using binmode for setting UTF-8 handling. I just follow simple rule: output only octets in appropriate encoding (normally it is UTF-8). Then I just use Encode::decode or Encode::encode to convert octets to strings as perl understands them, or back from perl strings to octets for output.
If there's 'use utf8', then any strings directly provided in the script will be converted to internal format understandable by perl, so those will have to be converted to octets before they are passed outside of perl program.
I've never seen $! containing non-english text because most of the time the systems I work with don't have anything but English stuff, so I don't know in which form is the text there. But if it is just sequence of octets, then you'll have problem outputting it through file handle expecting perl string and not sequence of octets.