This is clearly not a Perl problem (or at least I don't think so), but I can't think of a better place to get an understanding of what is happening.
If I run a bash shell in xterm -u8 (-u8 turns on UTF-8 mode for xterm), the following line in a Perl script appears to make the script die without even executing its END blocks:
#print first letter of Hebrew alphabet (aleph)
my $ch=chr(0x5D0); print STDERR "$ch\n";
This is just an appearance. In reality the line doesn't kill the script at all. I was able to run the same script in an XEmacs command shell, and it is clear that the script runs to completion. (See below for test code and output.) I can also avoid sudden death by starting up xterm in wide character mode (xterm -u8 -wc).
I'd like to understand why U+05D0 causes sudden death when wide character mode is off. Here in Israel, the first letter of the Hebrew alphabet (aleph) isn't exactly an exotic character. The more serious problem is that any test script I have also goes silent and appears to die if it prints a diagnostic containing that character, unless it is running in a specially configured terminal. Not good.
Other UTF-8 characters sometimes display two characters where I expect one, or display the wrong glyph (or a placeholder box). I could understand ugly output, but what is special about U+05D0 that would make a terminal think it should stop displaying output sent to STDOUT and STDERR?
Also, if there are any Israeli monks out there (or Hebrew-speaking monks from other parts of the world) reading this who are familiar with this issue and have a workaround they use, please speak up!
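As a stopgap for diagnostics, one option might be to escape everything outside printable ASCII before it reaches the terminal. This is only a minimal sketch: escape_non_ascii is a hypothetical helper name of my own, not something from any module, and it sidesteps the display problem rather than explaining it.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of the test script below): replace every
# character outside printable ASCII, except newline and tab, with a
# \x{...} escape so the terminal only ever receives plain ASCII bytes.
sub escape_non_ascii {
    my ($text) = @_;
    $text =~ s/([^\x20-\x7E\n\t])/sprintf('\\x{%x}', ord($1))/ge;
    return $text;
}

my $ch = chr(0x5D0);

# Prints: diagnostic with aleph: \x{5d0}
print STDERR escape_non_ascii("diagnostic with aleph: $ch\n");
```

The obvious cost is that the diagnostic no longer shows the actual glyph, so this only helps where readable code points are good enough.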
Platform details:
Debian (Lenny)
system perl (5.10.0)
xterm version: XTerm(235)
bash: GNU bash, version 3.2.39(1)-release (i486-pc-linux-gnu)
Test script:
use strict;
use warnings;
use PerlIO;
use Devel::Peek;
my $ch=chr(0x5D0);
Devel::Peek::Dump($ch);
binmode(STDERR);
print STDERR "layers for STDERR: @{[PerlIO::get_layers(STDERR)]}\n";
print STDERR "$ch\n"; #complains about wide character
binmode(STDERR, ":utf8");
print STDERR "layers for STDERR: @{[PerlIO::get_layers(STDERR)]}\n";
print STDERR "$ch\n"; # no complaints here
print STDERR "I survived :-) !!!\n";
print STDOUT "I really did. I really did.\n";
# END blocks to help verify that STDERR output is being
# truncated, and that the script is not merely aborting
END { warn "Ah...dead\n"; }
END { warn "I'm dying :-( \n" }
Output in Xemacs shell:
SV = PV(0x817c6d0) at 0x8197e90
REFCNT = 1
FLAGS = (PADMY,POK,pPOK,UTF8)
PV = 0x819e970 "\327\220"\0 [UTF8 "\x{5d0}"]
CUR = 2
LEN = 4
layers for STDERR: unix perlio
Wide character in print at Monks/Foo.pm line 916.
\220א
layers for STDERR: unix perlio utf8
\220א
I survived :-) !!!
I really did. I really did.
I'm dying :-(
Ah...dead
Output on xterm -u8 -wc (widechar on). The output is the same as in XEmacs, except that U+05D0 prints as "" not "\220":
SV = PV(0x817c6d0) at 0x8197e90
REFCNT = 1
FLAGS = (PADMY,POK,pPOK,UTF8)
PV = 0x819e970 "\327\220"\0 [UTF8 "\x{5d0}"]
CUR = 2
LEN = 4
layers for STDERR: unix perlio
Wide character in print at Monks/Foo.pm line 916.
layers for STDERR: unix perlio utf8
I survived :-) !!!
I really did. I really did.
I'm dying :-(
Ah...dead
Output on xterm -u8 (widechar off). Notice how, after the wide character warning, all output to STDOUT and STDERR disappears, as if U+05D0 caused STDOUT and STDERR to close. Note that it does not hang. The script just terminates with no further visible output, and a prompt for a new command appears.
$ perl myscript.pl
SV = PV(0x817c6d0) at 0x8197e90
REFCNT = 1
FLAGS = (PADMY,POK,pPOK,UTF8)
PV = 0x81a2560 "\327\220"\0 [UTF8 "\x{5d0}"]
CUR = 2
LEN = 4
layers for STDERR: unix perlio
Wide character in print at Monks/Foo.pm line 916.
$
Note: switching the order of output so that the output to STDERR with a utf8 layer comes first does not improve the situation. Instead of dying after the warning, it dies silently on the print statement.
$ perl myscript.pl
SV = PV(0x817c6d0) at 0x8197e90
REFCNT = 1
FLAGS = (PADMY,POK,pPOK,UTF8)
PV = 0x819f610 "\327\220"\0 [UTF8 "\x{5d0}"]
CUR = 2
LEN = 4
layers for STDERR: unix perlio utf8
$
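One thing I can check myself is what bytes actually go out. The "\327\220" in the Devel::Peek dumps above is the UTF-8 encoding of U+05D0, and the second byte (octal 220, hex 0x90) falls in the range 0x80..0x9F that ECMA-48 reserves for C1 control codes (0x90 is DCS there). Whether xterm is really treating that byte as a control code is only my guess, not something I have confirmed. The sketch below just lists the bytes and flags any in that range; utf8_bytes and is_c1_byte are hypothetical helper names of my own.

```perl
use strict;
use warnings;
use Encode qw(encode);

# Hypothetical helper: return the UTF-8 bytes of a character string
# as a list of integers.
sub utf8_bytes {
    my ($text) = @_;
    return unpack('C*', encode('UTF-8', $text));
}

# Hypothetical helper: true if the byte value lies in 0x80..0x9F, the
# range ECMA-48 assigns to C1 control codes.
sub is_c1_byte {
    my ($byte) = @_;
    return $byte >= 0x80 && $byte <= 0x9F;
}

# For U+05D0 this prints:
#   0xD7 (octal 327)
#   0x90 (octal 220)  <- in C1 control range
for my $byte (utf8_bytes(chr(0x5D0))) {
    printf "0x%02X (octal %03o)%s\n", $byte, $byte,
        is_c1_byte($byte) ? "  <- in C1 control range" : "";
}
```

If the guess is right, any character whose UTF-8 encoding contains a byte in that range could misbehave in a terminal that is not decoding the stream as UTF-8, which might also account for the other odd glyphs I see.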
Update: clarified that the script terminates with no further output and does not hang.