bulk88 has asked for the wisdom of the Perl Monks concerning the following question:
I am trying to use Encode::'s rawer API, and I am getting lots of crashes and warnings and other RANDOM behavior. Am I using Encode incorrectly or is this a bug with Encode?
edit: The failing tests are fine. I didn't change the test count or try to make them pass, since that changes the crash behavior and warnings a little bit; merely breathing on the script changes its behavior. If I run this script under the Win32 Debugging Heap, it tells me allocations are corrupt (write to freed memory, trashed allocation headers, etc.) instead of crashing.
Throwing in an assert ("assert(s+ ulen + 1 == e);") shows the problem is at http://perl5.git.perl.org/perl.git/blob/5e0a247b35271159d629ea8562732e0993ed4594:/cpan/Encode/Unicode/Unicode.xs#l321: SvPVutf8 does not return SvPVX when the string SV is RO flagged (see http://perl5.git.perl.org/perl.git/blob/5e0a247b35271159d629ea8562732e0993ed4594:/sv.c#l3060). So enc_pack() will write off the end of the PV buffer of SV result by the distance between char * s and char * e, which point into separate malloc blocks a random distance apart, not to the beginning and end of SV utf8's PV buffer. It eventually crashes (segv), either because unallocated VM between s and e was touched, or because unallocated VM after result SV's PV buffer was touched. Not sure what to do, since Encode gets almost no maintenance (http://perl5.git.perl.org/perl.git/blob/5e0a247b35271159d629ea8562732e0993ed4594:/cpan/Encode/Changes) and has many C bug/crash tickets open and unanswered at https://rt.cpan.org/Public/Dist/Display.html?Name=Encode.
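To make the s/e mismatch concrete, here is a minimal sketch in plain C that does not use the Perl API at all. Everything in it is hypothetical and only for illustration: fake_sv stands in for an SV, and fake_get_utf8 models an accessor that, as described above for RO scalars, hands back a separately allocated copy instead of the scalar's own buffer. It is not the code in Unicode.xs.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a Perl scalar; NOT the real SV layout. */
typedef struct {
    char  *buf;       /* the value's own buffer (the role SvPVX plays) */
    size_t len;       /* bytes in use, excluding the trailing NUL */
    int    readonly;  /* models the RO flag */
} fake_sv;

/* Hypothetical accessor: for a read-only value it returns a separately
 * allocated copy, so the result is NOT the value's own buffer.
 * The caller must free(*tmp) (free(NULL) is harmless). */
char *fake_get_utf8(const fake_sv *sv, size_t *len, char **tmp)
{
    *len = sv->len;
    *tmp = NULL;
    if (!sv->readonly)
        return sv->buf;
    *tmp = malloc(sv->len + 1);
    if (*tmp == NULL)
        abort();
    memcpy(*tmp, sv->buf, sv->len + 1);
    return *tmp;
}

/* Risky pattern: the end pointer is taken from the value's own buffer.
 * Returns 1 when start and end describe the same block, 0 otherwise. */
int end_from_own_buffer_matches(const fake_sv *sv)
{
    size_t len;
    char *tmp;
    char *s = fake_get_utf8(sv, &len, &tmp);
    char *e = sv->buf + sv->len;   /* may belong to a different block than s */
    int ok = (s + len == e);
    free(tmp);
    return ok;
}

/* Safer pattern: derive the end from the pointer and length that were
 * actually returned. Returns the number of bytes walked. */
size_t walk_returned_range(const fake_sv *sv)
{
    size_t len, n = 0;
    char *tmp;
    const char *s = fake_get_utf8(sv, &len, &tmp);
    const char *e = s + len;       /* always inside the same block as s */
    while (s < e) {
        n++;
        s++;
    }
    free(tmp);
    return n;
}
```

Called from a main() with one writable and one read-only fake_sv over the same text, the first function should report a match only for the writable value, while the second walks exactly len bytes in both cases. Whether the real fix belongs in the caller or in the accessor is a separate question that this sketch does not answer.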
A callstack of the crash is given at the end of this post.
The script:

    use strict;
    use warnings;
    use Test::More tests => 17;
    use Encode;

    my $utf32encoder = Encode::find_encoding('UTF-32LE');
    sub xs_edistance {
        my $a1 = $utf32encoder->encode(shift,0);
        my $a2 = $utf32encoder->encode(shift,0);
    }
    is( xs_edistance('four','fxxr'), 1, 'kgjsdfjkdsafs');
    is( xs_edistance('four','FOuR'), 1, 'kgjsdfjkdsafs');
    is( xs_edistance('four',''), 1, 'kgjsdfjkdsafs');
    is( xs_edistance('','four'), 1, 'kgjsdfjkdsafs');
    is( xs_edistance('',''), 1, 'kgjsdfjkdsafs');
    is( xs_edistance('four','fxxr'), 1, 'kgjsdfjkdsafs');
    is( xs_edistance('four','FOuR'), 1, 'kgjsdfjkdsafs');
    is( xs_edistance('four',''), 1, 'kgjsdfjkdsafs');
    is( xs_edistance('','four'), 1, 'kgjsdfjkdsafs');
    is( xs_edistance('',''), 1, 'kgjsdfjkdsafs');
The crash happened at "my $a2 = $utf32encoder->encode(shift,0);". If I change anything in the script, it will either not crash, crash quickly, or give infinite errors to the console. Infinite error example:

    1..17
    Malformed UTF-8 character (unexpected non-continuation byte 0x01, immediately after start byte 0xe8) in subroutine entry at n1.pl line 9.
    Malformed UTF-8 character (unexpected non-continuation byte 0x60, immediately after start byte 0xc0) in subroutine entry at n1.pl line 9.
    Malformed UTF-8 character (unexpected continuation byte 0xb0, with no preceding start byte) in subroutine entry at n1.pl line 9.
    Malformed UTF-8 character (unexpected continuation byte 0xb0, with no preceding start byte) in subroutine entry at n1.pl line 9.
    Malformed UTF-8 character (unexpected continuation byte 0x98, with no preceding start byte) in subroutine entry at n1.pl line 9.
    Malformed UTF-8 character (unexpected continuation byte 0x88, with no preceding start byte) in subroutine entry at n1.pl line 9.
    Out of memory!
    Out of memory!
    *CRASH*
start byte) in subroutine entry at n1.pl line 9. Malformed UTF-8 character (unexpected continuation byte 0xa8, with no +preceding start byte) in subroutine entry at n1.pl line 9. Malformed UTF-8 character (unexpected continuation byte 0xb0, with no +preceding start byte) in subroutine entry at n1.pl line 9. Malformed UTF-8 character (unexpected continuation byte 0xa3, with no +preceding start byte) in subroutine entry at n1.pl line 9. Malformed UTF-8 character (unexpected non-continuation byte 0xf7, imme +diately a ter start byte 0xf4) in subroutine entry at n1.pl line 9. Malformed UTF-8 character (unexpected non-continuation byte 0x04, imme +diately a ter start byte 0xf7) in subroutine entry at n1.pl line 9. Malformed UTF-8 character (unexpected non-continuation byte 0x2e, imme +diately a ter start byte 0xc4) in subroutine entry at n1.pl line 9. Malformed UTF-8 character (unexpected continuation byte 0xa3, with no +preceding start byte) in subroutine entry at n1.pl line 9. Malformed UTF-8 character (unexpected continuation byte 0xa8, with no +preceding start byte) in subroutine entry at n1.pl line 9. Malformed UTF-8 character (unexpected continuation byte 0x8c, with no +preceding start byte) in subroutine entry at n1.pl line 9. Malformed UTF-8 character (unexpected continuation byte 0xa3, with no +preceding start byte) in subroutine entry at n1.pl line 9. Malformed UTF-8 character (unexpected continuation byte 0xb2, with no +preceding start byte) in subroutine entry at n1.pl line 9. Malformed UTF-8 character (unexpected continuation byte 0x80, with no +preceding start byte) in subroutine entry at n1.pl line 9. Malformed UTF-8 character (unexpected non-continuation byte 0x2e, imme +diately a ter start byte 0xcc) in subroutine entry at n1.pl line 9. Malformed UTF-8 character (unexpected continuation byte 0xa3, with no +preceding start byte) in subroutine entry at n1.pl line 9. 
Malformed UTF-8 character (unexpected continuation byte 0xa3, with no +preceding start byte) in subroutine entry at n1.pl line 9. Malformed UTF-8 character (unexpected non-continuation byte 0x2e, imme +diately a ter start byte 0xfc) in subroutine entry at n1.pl line 9. Malformed UTF-8 character (unexpected continuation byte 0xa3, with no +preceding start byte) in subroutine entry at n1.pl line 9. Malformed UTF-8 character (unexpected continuation byte 0xa8, with no +preceding start byte) in subroutine entry at n1.pl line 9. Malformed UTF-8 character (unexpected continuation byte 0xa3, with no +preceding start byte) in subroutine entry at n1.pl line 9. Malformed UTF-8 character (unexpected continuation byte 0xa3, with no +preceding start byte) in subroutine entry at n1.pl line 9. Malformed UTF-8 character (unexpected continuation byte 0xa6, with no +preceding start byte) in subroutine entry at n1.pl line 9. Malformed UTF-8 character (unexpected non-continuation byte 0x04, imme +diately a ter start byte 0xf6) in subroutine entry at n1.pl line 9. Malformed UTF-8 character (unexpected continuation byte 0xb6, with no +preceding start byte) in subroutine entry at n1.pl line 9. Malformed UTF-8 character (unexpected continuation byte 0x9c, with no +preceding start byte) in subroutine entry at n1.pl line 9. Malformed UTF-8 character (unexpected continuation byte 0x9b, with no +preceding start byte) in subroutine entry at n1.pl line 9. Malformed UTF-8 character (unexpected continuation byte 0xa3, with no +preceding start byte) in subroutine entry at n1.pl line 9. Malformed UTF-8 character (unexpected continuation byte 0xa3, with no +preceding start byte) in subroutine entry at n1.pl line 9. Malformed UTF-8 character (unexpected continuation byte 0xa8, with no +preceding start byte) in subroutine entry at n1.pl line 9. Malformed UTF-8 character (unexpected continuation byte 0x80, with no +preceding start byte) in subroutine entry at n1.pl line 9. 
Malformed UTF-8 character (unexpected continuation byte 0xa3, with no +preceding start byte) in subroutine entry at n1.pl line 9. Malformed UTF-8 character (unexpected continuation byte 0xa3, with no +preceding start byte) in subroutine entry at n1.pl line 9. Malformed UTF-8 character (unexpected continuation byte 0x8c, with no +preceding start byte) in subroutine entry at n1.pl line 9. Malformed UTF-8 character (unexpected non-continuation byte 0x04, imme +diately a ter start byte 0xf7) in subroutine entry at n1.pl line 9. Malformed UTF-8 character (unexpected continuation byte 0xa3, with no +preceding start byte) in subroutine entry at n1.pl line 9. Malformed UTF-8 character (unexpected continuation byte 0xa8, with no +preceding start byte) in subroutine entry at n1.pl line 9. Malformed UTF-8 character (unexpected continuation byte 0xa3, with no +preceding start byte) in subroutine entry at n1.pl line 9. Malformed UTF-8 character (unexpected continuation byte 0xa3, with no +preceding start byte) in subroutine entry at n1.pl line 9. Malformed UTF-8 character (unexpected continuation byte 0x98, with no +preceding start byte) in subroutine entry at n1.pl line 9. Malformed UTF-8 character (unexpected continuation byte 0xa3, with no +preceding start byte) in subroutine entry at n1.pl line 9. Malformed UTF-8 character (unexpected continuation byte 0xa3, with no +preceding start byte) in subroutine entry at n1.pl line 9. Malformed UTF-8 character (unexpected continuation byte 0xa3, with no +preceding start byte) in subroutine entry at n1.pl line 9. Malformed UTF-8 character (unexpected continuation byte 0xa8, with no +preceding start byte) in subroutine entry at n1.pl line 9. Malformed UTF-8 character (unexpected continuation byte 0xa3, with no +preceding start byte) in subroutine entry at n1.pl line 9. Malformed UTF-8 character (unexpected continuation byte 0xa3, with no +preceding start byte) in subroutine entry at n1.pl line 9. 
Malformed UTF-8 character (unexpected continuation byte 0xa8, with no +preceding start byte) in subroutine entry at n1.pl line 9. Malformed UTF-8 character (unexpected continuation byte 0xa3, with no +preceding start byte) in subroutine entry at n1.pl line 9. Malformed UTF-8 character (unexpected continuation byte 0xa3, with no +preceding start byte) in subroutine entry at n1.pl line 9. Malformed UTF-8 character (unexpected continuation byte 0x80, with no +preceding start byte) in subroutine entry at n1.pl line 9. Malformed UTF-8 character (unexpected continuation byte 0xa3, with no +preceding start byte) in subroutine entry at n1.pl line 9. Malformed UTF-8 character (unexpected continuation byte 0x9c, with no +preceding start byte) in subroutine entry at n1.pl line 9. Malformed UTF-8 character (unexpected continuation byte 0xa3, with no +preceding start byte) in subroutine entry at n1.pl line 9. Malformed UTF-8 character (unexpected continuation byte 0xa8, with no +preceding start byte) in subroutine entry at n1.pl line 9. Malformed UTF-8 character (unexpected continuation byte 0xa3, with no +preceding start byte) in subroutine entry at n1.pl line 9. Malformed UTF-8 character (unexpected non-continuation byte 0xf7, imme +diately a ter start byte 0xf4) in subroutine entry at n1.pl line 9. Malformed UTF-8 character (unexpected non-continuation byte 0x04, imme +diately a ter start byte 0xf7) in subroutine entry at n1.pl line 9. Malformed UTF-8 character (unexpected non-continuation byte 0x2f, imme +diately a ter start byte 0xdc) in subroutine entry at n1.pl line 9. Malformed UTF-8 character (unexpected continuation byte 0xa3, with no +preceding start byte) in subroutine entry at n1.pl line 9. Malformed UTF-8 character (unexpected non-continuation byte 0x2f, imme +diately a ter start byte 0xd4) in subroutine entry at n1.pl line 9. Malformed UTF-8 character (unexpected continuation byte 0xa3, with no +preceding start byte) in subroutine entry at n1.pl line 9. 
Malformed UTF-8 character (unexpected continuation byte 0xa8, with no +preceding start byte) in subroutine entry at n1.pl line 9. Malformed UTF-8 character (unexpected continuation byte 0x84, with no +preceding start byte) in subroutine entry at n1.pl line 9. Malformed UTF-8 character (unexpected continuation byte 0xa3, with no +preceding start byte) in subroutine entry at n1.pl line 9. Malformed UTF-8 character (unexpected non-continuation byte 0x2f, imme +diately a ter start byte 0xdc) in subroutine entry at n1.pl line 9. Malformed UTF-8 character (unexpected continuation byte 0xa3, with no +preceding start byte) in subroutine entry at n1.pl line 9. Malformed UTF-8 character (unexpected continuation byte 0xa6, with no +preceding start byte) in subroutine entry at n1.pl line 9. Malformed UTF-8 character (unexpected non-continuation byte 0x04, imme +diately a ter start byte 0xf6) in subroutine entry at n1.pl line 9. Malformed UTF-8 character (unexpected continuation byte 0xb6, with no +preceding start byte) in subroutine entry at n1.pl line 9. Malformed UTF-8 character (unexpected continuation byte 0xa3, with no +preceding start byte) in subroutine entry at n1.pl line 9. Malformed UTF-8 character (unexpected non-continuation byte 0x2f, imme +diately a ter start byte 0xf4) in subroutine entry at n1.pl line 9. Malformed UTF-8 character (unexpected continuation byte 0xa3, with no +preceding start byte) in subroutine entry at n1.pl line 9. Malformed UTF-8 character (unexpected continuation byte 0xa8, with no +preceding start byte) in subroutine entry at n1.pl line 9. Malformed UTF-8 character (unexpected non-continuation byte 0x2e, imme +diately a ter start byte 0xcc) in subroutine entry at n1.pl line 9. Malformed UTF-8 character (unexpected continuation byte 0xa3, with no +preceding start byte) in subroutine entry at n1.pl line 9. Malformed UTF-8 character (unexpected non-continuation byte 0x2e, imme +diately a ter start byte 0xcc) in subroutine entry at n1.pl line 9. 
Malformed UTF-8 character (unexpected continuation byte 0xa3, with no +preceding start byte) in subroutine entry at n1.pl line 9. Malformed UTF-8 character (unexpected continuation byte 0x82, with no +preceding start byte) in subroutine entry at n1.pl line 9. Malformed UTF-8 character (unexpected non-continuation byte 0x04, imme +diately a ter start byte 0xf7) in subroutine entry at n1.pl line 9. Malformed UTF-8 character (unexpected non-continuation byte 0x2f, imme +diately a ter start byte 0xfc) in subroutine entry at n1.pl line 9. Malformed UTF-8 character (unexpected continuation byte 0xa3, with no +preceding start byte) in subroutine entry at n1.pl line 9. Malformed UTF-8 character (unexpected continuation byte 0xa3, with no +preceding start byte) in subroutine entry at n1.pl line 9. Malformed UTF-8 character (unexpected continuation byte 0xa8, with no +preceding start byte) in subroutine entry at n1.pl line 9. Malformed UTF-8 character (unexpected non-continuation byte 0x2e, imme +diately a ter start byte 0xcc) in subroutine entry at n1.pl line 9. Malformed UTF-8 character (unexpected continuation byte 0xa3, with no +preceding start byte) in subroutine entry at n1.pl line 9. Malformed UTF-8 character (unexpected continuation byte 0xa6, with no +preceding start byte) in subroutine entry at n1.pl line 9. Malformed UTF-8 character (unexpected continuation byte 0xa3, with no +preceding start byte) in subroutine entry at n1.pl line 9. Malformed UTF-8 character (unexpected continuation byte 0xa3, with no +preceding start byte) in subroutine entry at n1.pl line 9. Malformed UTF-8 character (unexpected continuation byte 0xa3, with no +preceding start byte) in subroutine entry at n1.pl line 9. Malformed UTF-8 character (unexpected continuation byte 0xa8, with no +preceding start byte) in subroutine entry at n1.pl line 9. Malformed UTF-8 character (unexpected non-continuation byte 0x2f, imme +diately a ter start byte 0xdc) in subroutine entry at n1.pl line 9. 
Malformed UTF-8 character (unexpected continuation byte 0xa3, with no +preceding start byte) in subroutine entry at n1.pl line 9. Malformed UTF-8 character (unexpected continuation byte 0x82, with no +preceding start byte) in subroutine entry at n1.pl line 9. Malformed UTF-8 character (unexpected non-continuation byte 0x04, imme +diately a ter start byte 0xf7) in subroutine entry at n1.pl line 9. Malformed UTF-8 character (unexpected continuation byte 0x85, with no +preceding start byte) in subroutine entry at n1.pl line 9. Malformed UTF-8 character (unexpected continuation byte 0xa4, with no +preceding start byte) in subroutine entry at n1.pl line 9. Malformed UTF-8 character (unexpected continuation byte 0xa3, with no +preceding start byte) in subroutine entry at n1.pl line 9. Malformed UTF-8 character (unexpected continuation byte 0xa3, with no +preceding start byte) in subroutine entry at n1.pl line 9. Malformed UTF-8 character (unexpected continuation byte 0xa8, with no +preceding start byte) in subroutine entry at n1.pl line 9.
It tried to read pointer 0x25. Another crash:

    ntdll.dll!_RtlAllocateHeap@12() + 0x26916
    msvcr71.dll!_heap_alloc(unsigned int size=124) Line 212 C
    msvcr71.dll!_nh_malloc(unsigned int size=124, int nhFlag=0) Line 113 C
    msvcr71.dll!malloc(unsigned int size=124) Line 54 + 0xf C
    > perl517.dll!VMem::Malloc(unsigned int size=112) Line 151 + 0xe C
    perl517.dll!PerlMemMalloc(IPerlMem * piPerl=0x0034815c, unsigned int size=112) Line 299 + 0x14 C
    perl517.dll!Perl_safesysmalloc(unsigned int size=112) Line 92 C
    perl517.dll!Perl_av_extend_guts(interpreter * my_perl=0x00342c14, av * av=0x00ad15b4, long key=28, int * maxp=0x00a30a24, sv * * * allocp=0x00b06280, sv * * * arrayp=0x00ad15c0) Line 163 + 0x31 C
    perl517.dll!Perl_av_extend(interpreter * my_perl=0x00342c14, av * av=0x00000000, long key=12) Line 83 + 0x17 C
    perl517.dll!Perl_sv_add_backref(interpreter * my_perl=0x00342c14, sv * const tsv=0x00ad1654, sv * const sv=0x00ad6784) Line 5627 + 0xb C
    perl517.dll!Perl_gv_init_pvn(interpreter * my_perl=0x00342c14, gv * gv=0x00ad6784, hv * stash=0x00ad1654, const char * name=0x280d592c, unsigned int len=7, unsigned long flags=2) Line 382 + 0x8 C
    perl517.dll!Perl_gv_fetchmeth_pvn(interpreter * my_perl=0x00342c14, hv * stash=0x00000007, const char * name=0x280d592c, unsigned int len=7, long level=0, unsigned long flags=0) Line 692 + 0x16 C
    perl517.dll!Perl_gv_fetchmeth_pvn_autoload(interpreter * my_perl=0x00342c14, hv * stash=0x00ad1654, const char * name=0x280d592c, unsigned int len=7, long level=0, unsigned long flags=0) Line 857 C
    perl517.dll!S_curse(interpreter * my_perl=0x00b06280, sv * const sv=0x00ad0c64, const char check_refcnt='') Line 6446 + 0x12 C
    perl517.dll!Perl_sv_clear(interpreter * my_perl=0x00342c14, sv * const orig_sv=0x00ad0c64) Line 6117 + 0xb C
    perl517.dll!Perl_sv_free2(interpreter * my_perl=0x00342c14, sv * const sv=0x00ad0c64, const unsigned long rc=1) Line 6584 C
    perl517.dll!S_SvREFCNT_dec(interpreter * my_perl=0x00342c14, sv * sv=0x00000020) Line 62 + 0xb C
    perl517.dll!do_clean_objs(interpreter * my_perl=0x00342c14, sv * const ref=0x00ad0a04) Line 480 + 0x13 C
    perl517.dll!S_visit(interpreter * my_perl=0x00342c14, void (interpreter *, sv *)* f=0x28083583, const unsigned long flags=2048, const unsigned long mask=2048) Line 423 C
    perl517.dll!Perl_sv_clean_objs(interpreter * my_perl=0x00b06280) Line 581 C
    perl517.dll!perl_destruct(interpreter * my_perl=0x00342c14) Line 772 C
    perl517.dll!RunPerl(int argc=2, char * * argv=0x01345c98, char * * env=0x00343f70) Line 275 C
    perl.exe!mainCRTStartup() Line 398 + 0xe C
    kernel32.dll!_BaseProcessStart@4() + 0x23

Callstack of the crash in enc_pack:

    > Unicode.dll!enc_pack(interpreter * my_perl=0x00342c14, sv * result=0x00000000, unsigned int size=11343540, unsigned char endian='V', unsigned long value=86) Line 104 C
    Unicode.dll!XS_Encode__Unicode_encode_xs(interpreter * my_perl=0x00342c00, cv * cv=0x00ad0d34) Line 378 + 0x16 C
    perl517.dll!Perl_pp_entersub(interpreter * my_perl=0x00000002) Line 2877 C
    perl517.dll!Perl_runops_standard(interpreter * my_perl=0x00342c14) Line 42 + 0x4 C
    perl517.dll!S_run_body(interpreter * my_perl=0x00000004, long oldscope=1) Line 2430 + 0xa C
    perl517.dll!perl_run(interpreter * my_perl=0x00342c14) Line 2346 + 0x8 C
    perl517.dll!RunPerl(int argc=2, char * * argv=0x01345c98, char * * env=0x00343f70) Line 270 + 0x6 C
    perl.exe!mainCRTStartup() Line 398 + 0xe C
    kernel32.dll!_BaseProcessStart@4() + 0x23