http://www.perlmonks.org?node_id=1026042

vsespb has asked for the wisdom of the Perl Monks concerning the following question:

#!/usr/bin/env perl use strict; use warnings; $SIG{__DIE__} = sub { if ($^S == 0) { for my $s (0..$#_) { dcs($_[$s]); }; } }; sub dcs { my ($p1, $p2) = @_; my $cs=''; for (my $i=1; $i<3; $i++) { my ($package, $filename, $line, $subroutine) = caller($i); last if ( ! defined($package) ); print STDERR $filename; } print STDERR $p1; } sub mysub { die; } binmode STDERR, ":encoding(koi8-r)"; mysub(); 1;
This code just hangs if path to perl program contain UTF-8 characters (like 'perl тест/poc.pl').

It can reproduce it on two linux boxes (perl 10 and perl 12). Both have UTF-8 locale, UTF-8 filenames.

I hangs with different single-byte encodings (i tried koi-8 and cp1251). It hangs only when STDERR used and only when die is inside mysub.

The reason why I am trying to debug such a weird code in these weird circumstances - I have another script with similar code and a user with similar problem, however I cannot reproduce his problem in full. So I came to this PoC - I think maybe it's related.

Also this user does not do fancy things like using non-ASCII characters in filenames or using encoding different from locale encoding. However he runs FreeBSD and his locale is koi8-r (and my program sets STDERR encoding to KOI-8 too)

UPD: Perl hangs with 100% CPU usage and strace hangs with:
open("/usr/lib/perl/5.10/auto/Encode/Byte/Byte.so", O_RDONLY) = 3 read(3, "\177ELF\2\1\1\0\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0\240\\\3\0\0\0\ +0\0"..., 832) = 832 fstat(3, {st_mode=S_IFREG|0644, st_size=375048, ...}) = 0 mmap(NULL, 2470176, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, + 0) = 0x7fcef2b27000 mprotect(0x7fcef2b62000, 2093056, PROT_NONE) = 0 mmap(0x7fcef2d61000, 139264, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIX +ED|MAP_DENYWRITE, 3, 0x3a000) = 0x7fcef2d61000 close(3) = 0 mprotect(0x7fcef2d61000, 135168, PROT_READ) = 0
UPD: Shortest PoC:
#!/usr/bin/env perl use strict; use warnings; use Carp; $SIG{__DIE__} = sub { print STDERR $_ for (@_); }; binmode STDERR, ":encoding(koi8-r)"; sub mysub { croak; } mysub(); 1;
Again, should be ran as 'perl тест/poc.pl'. In this version die() does not reproduce problem - only 'croak'. Also cp1251 works without problem.

Replies are listed 'Best First'.
Re: Weird STDERR/SIGDIE/Encodings issue
by McA (Priest) on Mar 28, 2013 at 22:54 UTC

    In sub dsc try the following:

    print STDERR decode('UTF-8', $filename);

    As far as I can see you get an encoding error in dsc which is called in a die handler and I'm pretty sure this causes a recursion.

    McA

      Almost any change to that code fixes it. Yes, including decode.

      I more interesting in finding root of problem, as user have crashes in a bit different circumstances.

      Also in my program I will be unable to just "decode" things, as it prints stacktrace, including function argument and sometimes I have binary data (not character string) or even filenames in binary format (not UTF-8, because FreeBSD filesystem sometimes use non-UTF-8 filenames)

      I think die inside die handler cannot cause a recursion:

      http://perldoc.perl.org/perlvar.html#%25SIG

       The __DIE__ handler is explicitly disabled during the call, so that you can die from a __DIE__ handler

      also such recursion would produce stackoverflow fast.

      Currently it looks to me that it's just a bug in Encode XS code, and I can workaround it by _escaping_ any non-ASCII octets (both binary data and wide characters)

        Concerning the recursion I speculated about the following: You're right that the sig handler is disabled, but the standard die functionality is done anyway. So the error string is printed via STDERR on which you enabled the encoding IO layer with binmode. The following code reproduces the error:

        #!/usr/bin/perl use strict; use warnings; use Data::Dumper; use Encode; $SIG{__DIE__} = sub { if (defined $^S && $^S == 0) { for my $s (0..$#_) { dcs($_[$s]); }; } }; sub dcs { print Dumper(\@_); my ($p1, $p2) = @_; my @caller = caller(0); print Dumper(\@caller); print STDERR $caller[1]; print STDERR "me: " . $p1; } sub mysub { die; } binmode STDERR, ":encoding(koi8-r)"; binmode STDOUT, ":encoding(koi8-r)"; mysub();

        As soon as you change the line

        print STDERR $caller[1];
        by
        print STDOUT $caller[1];
        you don't get the hanging behaviour.

        UPDATE: If you keep the first version but you insert the follwoing line in the if-clause of your die-handler

        binmode STDERR, ':raw';
        you also don't get the hanging behaviour even when printing at STDERR.

        McA