<?xml version="1.0" encoding="windows-1252"?>
<node id="1013162" title="random crashes, memory corruption, warnings from Encode" created="2013-01-14 01:57:11" updated="2013-01-14 01:57:11">
<type id="115">
perlquestion</type>
<author id="857302">
bulk88</author>
<data>
<field name="doctext">
I am trying to use Encode's lower-level API (calling encode() directly on the object returned by find_encoding()), and I am getting lots of crashes, warnings, and other random behavior. Am I using Encode incorrectly, or is this a bug in Encode?
&lt;br&gt;&lt;br&gt;
edit: The failing tests are fine. I did not fix the test count or try to make the tests pass, since doing so changes the crash behavior and the warnings a little; even trivial edits to the script change the behavior. If I run this script under the Win32 debugging heap, it reports corrupt allocations (write to freed memory, trashed allocation headers, etc.) instead of crashing.
&lt;br&gt;&lt;br&gt;
Adding an assert ("    assert(s+ ulen + 1 == e);") shows that the problem is at
[href://http://perl5.git.perl.org/perl.git/blob/5e0a247b35271159d629ea8562732e0993ed4594:/cpan/Encode/Unicode/Unicode.xs#l321]: SvPVutf8 does not return SvPVX when the string SV is flagged read-only (see [href://http://perl5.git.perl.org/perl.git/blob/5e0a247b35271159d629ea8562732e0993ed4594:/sv.c#l3060]). As a result, char * s and char * e point into separate malloc blocks a random distance apart, not to the beginning and end of SV utf8's PV buffer, and enc_pack() writes off the end of the PV buffer of SV result for the whole distance between them. It eventually crashes (segv) because it touched unallocated VM, either between s and e or after result SV's PV buffer. I am not sure what to do, since Encode gets almost no maintenance ([href://http://perl5.git.perl.org/perl.git/blob/5e0a247b35271159d629ea8562732e0993ed4594:/cpan/Encode/Changes]) and has many C bug and crash tickets open and unanswered ([href://https://rt.cpan.org/Public/Dist/Display.html?Name=Encode]).
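&lt;br&gt;&lt;br&gt;
A possible workaround, as a minimal and unverified sketch: if the diagnosis above is right, only read-only strings (such as the literals that shift hands back in the script) make SvPVutf8 return a pointer into a temporary copy, so copying each argument into an ordinary lexical before encoding should keep s and e inside the same buffer. The helper name encode_utf32le_copy is hypothetical and exists only for illustration; whether this avoids the corruption on a given perl/Encode build is an assumption, not something confirmed here.
&lt;code&gt;
use strict;
use warnings;
use Encode;

my $utf32encoder = Encode::find_encoding('UTF-32LE');

# Hypothetical helper: encode a private copy of the argument,
# never the caller's (possibly read-only) scalar.
sub encode_utf32le_copy {
    # "my $str = shift" creates a fresh, writable scalar, so the
    # read-only literal is never handed to the XS encoder itself.
    my $str = shift;
    return $utf32encoder-&gt;encode($str, 0);
}

my $bytes = encode_utf32le_copy('four');
printf "%d bytes\n", length $bytes;
&lt;/code&gt;
The original script that reproduces the problem follows unchanged.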

&lt;code&gt;
use strict;
use warnings;
use Test::More tests =&gt; 17;
use Encode;

my $utf32encoder = Encode::find_encoding('UTF-32LE');
sub xs_edistance {
    my $a1 = $utf32encoder-&gt;encode(shift,0);
    my $a2 = $utf32encoder-&gt;encode(shift,0);
}



is( xs_edistance('four','fxxr'), 1, 'kgjsdfjkdsafs');
is( xs_edistance('four','FOuR'), 1, 'kgjsdfjkdsafs');
is( xs_edistance('four',''), 1, 'kgjsdfjkdsafs');
is( xs_edistance('','four'), 1, 'kgjsdfjkdsafs');
is( xs_edistance('',''), 1, 'kgjsdfjkdsafs');
is( xs_edistance('four','fxxr'), 1, 'kgjsdfjkdsafs');
is( xs_edistance('four','FOuR'), 1, 'kgjsdfjkdsafs');
is( xs_edistance('four',''), 1, 'kgjsdfjkdsafs');
is( xs_edistance('','four'), 1, 'kgjsdfjkdsafs');
is( xs_edistance('',''), 1, 'kgjsdfjkdsafs');
&lt;/code&gt;
This gives:
&lt;code&gt;
1..17
Malformed UTF-8 character (unexpected non-continuation byte 0x01, immediately af
ter start byte 0xe8) in subroutine entry at n1.pl line 9.
Malformed UTF-8 character (unexpected non-continuation byte 0x60, immediately af
ter start byte 0xc0) in subroutine entry at n1.pl line 9.
Malformed UTF-8 character (unexpected continuation byte 0xb0, with no preceding
start byte) in subroutine entry at n1.pl line 9.
Malformed UTF-8 character (unexpected continuation byte 0xb0, with no preceding
start byte) in subroutine entry at n1.pl line 9.
Malformed UTF-8 character (unexpected continuation byte 0x98, with no preceding
start byte) in subroutine entry at n1.pl line 9.
Malformed UTF-8 character (unexpected continuation byte 0x88, with no preceding
start byte) in subroutine entry at n1.pl line 9.
Out of memory!
Out of memory!
*CRASH*
&lt;/code&gt;
The crash happened at "    my $a2 = $utf32encoder-&gt;encode(shift,0);". If I change anything in the script, it will either not crash, crash quickly, or print errors to the console endlessly. An example of the endless errors:
&lt;p&gt;
&lt;readmore&gt;
&lt;code&gt;
start byte) in subroutine entry at n1.pl line 9.
Malformed UTF-8 character (unexpected continuation byte 0xa8, with no preceding
start byte) in subroutine entry at n1.pl line 9.
Malformed UTF-8 character (unexpected continuation byte 0xb0, with no preceding
start byte) in subroutine entry at n1.pl line 9.
Malformed UTF-8 character (unexpected continuation byte 0xa3, with no preceding
start byte) in subroutine entry at n1.pl line 9.
Malformed UTF-8 character (unexpected non-continuation byte 0xf7, immediately a
ter start byte 0xf4) in subroutine entry at n1.pl line 9.
Malformed UTF-8 character (unexpected non-continuation byte 0x04, immediately a
ter start byte 0xf7) in subroutine entry at n1.pl line 9.
Malformed UTF-8 character (unexpected non-continuation byte 0x2e, immediately a
ter start byte 0xc4) in subroutine entry at n1.pl line 9.
Malformed UTF-8 character (unexpected continuation byte 0xa3, with no preceding
start byte) in subroutine entry at n1.pl line 9.
Malformed UTF-8 character (unexpected continuation byte 0xa8, with no preceding
start byte) in subroutine entry at n1.pl line 9.
Malformed UTF-8 character (unexpected continuation byte 0x8c, with no preceding
start byte) in subroutine entry at n1.pl line 9.
Malformed UTF-8 character (unexpected continuation byte 0xa3, with no preceding
start byte) in subroutine entry at n1.pl line 9.
Malformed UTF-8 character (unexpected continuation byte 0xb2, with no preceding
start byte) in subroutine entry at n1.pl line 9.
Malformed UTF-8 character (unexpected continuation byte 0x80, with no preceding
start byte) in subroutine entry at n1.pl line 9.
Malformed UTF-8 character (unexpected non-continuation byte 0x2e, immediately a
ter start byte 0xcc) in subroutine entry at n1.pl line 9.
Malformed UTF-8 character (unexpected continuation byte 0xa3, with no preceding
start byte) in subroutine entry at n1.pl line 9.
Malformed UTF-8 character (unexpected continuation byte 0xa3, with no preceding
start byte) in subroutine entry at n1.pl line 9.
Malformed UTF-8 character (unexpected non-continuation byte 0x2e, immediately a
ter start byte 0xfc) in subroutine entry at n1.pl line 9.
Malformed UTF-8 character (unexpected continuation byte 0xa3, with no preceding
start byte) in subroutine entry at n1.pl line 9.
Malformed UTF-8 character (unexpected continuation byte 0xa8, with no preceding
start byte) in subroutine entry at n1.pl line 9.
Malformed UTF-8 character (unexpected continuation byte 0xa3, with no preceding
start byte) in subroutine entry at n1.pl line 9.
Malformed UTF-8 character (unexpected continuation byte 0xa3, with no preceding
start byte) in subroutine entry at n1.pl line 9.
Malformed UTF-8 character (unexpected continuation byte 0xa6, with no preceding
start byte) in subroutine entry at n1.pl line 9.
Malformed UTF-8 character (unexpected non-continuation byte 0x04, immediately a
ter start byte 0xf6) in subroutine entry at n1.pl line 9.
Malformed UTF-8 character (unexpected continuation byte 0xb6, with no preceding
start byte) in subroutine entry at n1.pl line 9.
Malformed UTF-8 character (unexpected continuation byte 0x9c, with no preceding
start byte) in subroutine entry at n1.pl line 9.
Malformed UTF-8 character (unexpected continuation byte 0x9b, with no preceding
start byte) in subroutine entry at n1.pl line 9.
Malformed UTF-8 character (unexpected continuation byte 0xa3, with no preceding
start byte) in subroutine entry at n1.pl line 9.
Malformed UTF-8 character (unexpected continuation byte 0xa3, with no preceding
start byte) in subroutine entry at n1.pl line 9.
Malformed UTF-8 character (unexpected continuation byte 0xa8, with no preceding
start byte) in subroutine entry at n1.pl line 9.
Malformed UTF-8 character (unexpected continuation byte 0x80, with no preceding
start byte) in subroutine entry at n1.pl line 9.
Malformed UTF-8 character (unexpected continuation byte 0xa3, with no preceding
start byte) in subroutine entry at n1.pl line 9.
Malformed UTF-8 character (unexpected continuation byte 0xa3, with no preceding
start byte) in subroutine entry at n1.pl line 9.
Malformed UTF-8 character (unexpected continuation byte 0x8c, with no preceding
start byte) in subroutine entry at n1.pl line 9.
Malformed UTF-8 character (unexpected non-continuation byte 0x04, immediately a
ter start byte 0xf7) in subroutine entry at n1.pl line 9.
Malformed UTF-8 character (unexpected continuation byte 0xa3, with no preceding
start byte) in subroutine entry at n1.pl line 9.
Malformed UTF-8 character (unexpected continuation byte 0xa8, with no preceding
start byte) in subroutine entry at n1.pl line 9.
Malformed UTF-8 character (unexpected continuation byte 0xa3, with no preceding
start byte) in subroutine entry at n1.pl line 9.
Malformed UTF-8 character (unexpected continuation byte 0xa3, with no preceding
start byte) in subroutine entry at n1.pl line 9.
Malformed UTF-8 character (unexpected continuation byte 0x98, with no preceding
start byte) in subroutine entry at n1.pl line 9.
Malformed UTF-8 character (unexpected continuation byte 0xa3, with no preceding
start byte) in subroutine entry at n1.pl line 9.
Malformed UTF-8 character (unexpected continuation byte 0xa3, with no preceding
start byte) in subroutine entry at n1.pl line 9.
Malformed UTF-8 character (unexpected continuation byte 0xa3, with no preceding
start byte) in subroutine entry at n1.pl line 9.
Malformed UTF-8 character (unexpected continuation byte 0xa8, with no preceding
start byte) in subroutine entry at n1.pl line 9.
Malformed UTF-8 character (unexpected continuation byte 0xa3, with no preceding
start byte) in subroutine entry at n1.pl line 9.
Malformed UTF-8 character (unexpected continuation byte 0xa3, with no preceding
start byte) in subroutine entry at n1.pl line 9.
Malformed UTF-8 character (unexpected continuation byte 0xa8, with no preceding
start byte) in subroutine entry at n1.pl line 9.
Malformed UTF-8 character (unexpected continuation byte 0xa3, with no preceding
start byte) in subroutine entry at n1.pl line 9.
Malformed UTF-8 character (unexpected continuation byte 0xa3, with no preceding
start byte) in subroutine entry at n1.pl line 9.
Malformed UTF-8 character (unexpected continuation byte 0x80, with no preceding
start byte) in subroutine entry at n1.pl line 9.
Malformed UTF-8 character (unexpected continuation byte 0xa3, with no preceding
start byte) in subroutine entry at n1.pl line 9.
Malformed UTF-8 character (unexpected continuation byte 0x9c, with no preceding
start byte) in subroutine entry at n1.pl line 9.
Malformed UTF-8 character (unexpected continuation byte 0xa3, with no preceding
start byte) in subroutine entry at n1.pl line 9.
Malformed UTF-8 character (unexpected continuation byte 0xa8, with no preceding
start byte) in subroutine entry at n1.pl line 9.
Malformed UTF-8 character (unexpected continuation byte 0xa3, with no preceding
start byte) in subroutine entry at n1.pl line 9.
Malformed UTF-8 character (unexpected non-continuation byte 0xf7, immediately a
ter start byte 0xf4) in subroutine entry at n1.pl line 9.
Malformed UTF-8 character (unexpected non-continuation byte 0x04, immediately a
ter start byte 0xf7) in subroutine entry at n1.pl line 9.
Malformed UTF-8 character (unexpected non-continuation byte 0x2f, immediately a
ter start byte 0xdc) in subroutine entry at n1.pl line 9.
Malformed UTF-8 character (unexpected continuation byte 0xa3, with no preceding
start byte) in subroutine entry at n1.pl line 9.
Malformed UTF-8 character (unexpected non-continuation byte 0x2f, immediately a
ter start byte 0xd4) in subroutine entry at n1.pl line 9.
Malformed UTF-8 character (unexpected continuation byte 0xa3, with no preceding
start byte) in subroutine entry at n1.pl line 9.
Malformed UTF-8 character (unexpected continuation byte 0xa8, with no preceding
start byte) in subroutine entry at n1.pl line 9.
Malformed UTF-8 character (unexpected continuation byte 0x84, with no preceding
start byte) in subroutine entry at n1.pl line 9.
Malformed UTF-8 character (unexpected continuation byte 0xa3, with no preceding
start byte) in subroutine entry at n1.pl line 9.
Malformed UTF-8 character (unexpected non-continuation byte 0x2f, immediately a
ter start byte 0xdc) in subroutine entry at n1.pl line 9.
Malformed UTF-8 character (unexpected continuation byte 0xa3, with no preceding
start byte) in subroutine entry at n1.pl line 9.
Malformed UTF-8 character (unexpected continuation byte 0xa6, with no preceding
start byte) in subroutine entry at n1.pl line 9.
Malformed UTF-8 character (unexpected non-continuation byte 0x04, immediately a
ter start byte 0xf6) in subroutine entry at n1.pl line 9.
Malformed UTF-8 character (unexpected continuation byte 0xb6, with no preceding
start byte) in subroutine entry at n1.pl line 9.
Malformed UTF-8 character (unexpected continuation byte 0xa3, with no preceding
start byte) in subroutine entry at n1.pl line 9.
Malformed UTF-8 character (unexpected non-continuation byte 0x2f, immediately a
ter start byte 0xf4) in subroutine entry at n1.pl line 9.
Malformed UTF-8 character (unexpected continuation byte 0xa3, with no preceding
start byte) in subroutine entry at n1.pl line 9.
Malformed UTF-8 character (unexpected continuation byte 0xa8, with no preceding
start byte) in subroutine entry at n1.pl line 9.
Malformed UTF-8 character (unexpected non-continuation byte 0x2e, immediately a
ter start byte 0xcc) in subroutine entry at n1.pl line 9.
Malformed UTF-8 character (unexpected continuation byte 0xa3, with no preceding
start byte) in subroutine entry at n1.pl line 9.
Malformed UTF-8 character (unexpected non-continuation byte 0x2e, immediately a
ter start byte 0xcc) in subroutine entry at n1.pl line 9.
Malformed UTF-8 character (unexpected continuation byte 0xa3, with no preceding
start byte) in subroutine entry at n1.pl line 9.
Malformed UTF-8 character (unexpected continuation byte 0x82, with no preceding
start byte) in subroutine entry at n1.pl line 9.
Malformed UTF-8 character (unexpected non-continuation byte 0x04, immediately a
ter start byte 0xf7) in subroutine entry at n1.pl line 9.
Malformed UTF-8 character (unexpected non-continuation byte 0x2f, immediately a
ter start byte 0xfc) in subroutine entry at n1.pl line 9.
Malformed UTF-8 character (unexpected continuation byte 0xa3, with no preceding
start byte) in subroutine entry at n1.pl line 9.
Malformed UTF-8 character (unexpected continuation byte 0xa3, with no preceding
start byte) in subroutine entry at n1.pl line 9.
Malformed UTF-8 character (unexpected continuation byte 0xa8, with no preceding
start byte) in subroutine entry at n1.pl line 9.
Malformed UTF-8 character (unexpected non-continuation byte 0x2e, immediately a
ter start byte 0xcc) in subroutine entry at n1.pl line 9.
Malformed UTF-8 character (unexpected continuation byte 0xa3, with no preceding
start byte) in subroutine entry at n1.pl line 9.
Malformed UTF-8 character (unexpected continuation byte 0xa6, with no preceding
start byte) in subroutine entry at n1.pl line 9.
Malformed UTF-8 character (unexpected continuation byte 0xa3, with no preceding
start byte) in subroutine entry at n1.pl line 9.
Malformed UTF-8 character (unexpected continuation byte 0xa3, with no preceding
start byte) in subroutine entry at n1.pl line 9.
Malformed UTF-8 character (unexpected continuation byte 0xa3, with no preceding
start byte) in subroutine entry at n1.pl line 9.
Malformed UTF-8 character (unexpected continuation byte 0xa8, with no preceding
start byte) in subroutine entry at n1.pl line 9.
Malformed UTF-8 character (unexpected non-continuation byte 0x2f, immediately a
ter start byte 0xdc) in subroutine entry at n1.pl line 9.
Malformed UTF-8 character (unexpected continuation byte 0xa3, with no preceding
start byte) in subroutine entry at n1.pl line 9.
Malformed UTF-8 character (unexpected continuation byte 0x82, with no preceding
start byte) in subroutine entry at n1.pl line 9.
Malformed UTF-8 character (unexpected non-continuation byte 0x04, immediately a
ter start byte 0xf7) in subroutine entry at n1.pl line 9.
Malformed UTF-8 character (unexpected continuation byte 0x85, with no preceding
start byte) in subroutine entry at n1.pl line 9.
Malformed UTF-8 character (unexpected continuation byte 0xa4, with no preceding
start byte) in subroutine entry at n1.pl line 9.
Malformed UTF-8 character (unexpected continuation byte 0xa3, with no preceding
start byte) in subroutine entry at n1.pl line 9.
Malformed UTF-8 character (unexpected continuation byte 0xa3, with no preceding
start byte) in subroutine entry at n1.pl line 9.
Malformed UTF-8 character (unexpected continuation byte 0xa8, with no preceding
start byte) in subroutine entry at n1.pl line 9.
&lt;/code&gt;
&lt;/readmore&gt;
&lt;/p&gt;
A call stack of the crash:
&lt;code&gt;
 	ntdll.dll!_RtlAllocateHeap@12()  + 0x26916	
 	msvcr71.dll!_heap_alloc(unsigned int size=124)  Line 212	C
 	msvcr71.dll!_nh_malloc(unsigned int size=124, int nhFlag=0)  Line 113	C
 	msvcr71.dll!malloc(unsigned int size=124)  Line 54 + 0xf	C
&gt;	perl517.dll!VMem::Malloc(unsigned int size=112)  Line 151 + 0xe	C
 	perl517.dll!PerlMemMalloc(IPerlMem * piPerl=0x0034815c, unsigned int size=112)  Line 299 + 0x14	C
 	perl517.dll!Perl_safesysmalloc(unsigned int size=112)  Line 92	C
 	perl517.dll!Perl_av_extend_guts(interpreter * my_perl=0x00342c14, av * av=0x00ad15b4, long key=28, int * maxp=0x00a30a24, sv * * * allocp=0x00b06280, sv * * * arrayp=0x00ad15c0)  Line 163 + 0x31	C
 	perl517.dll!Perl_av_extend(interpreter * my_perl=0x00342c14, av * av=0x00000000, long key=12)  Line 83 + 0x17	C
 	perl517.dll!Perl_sv_add_backref(interpreter * my_perl=0x00342c14, sv * const tsv=0x00ad1654, sv * const sv=0x00ad6784)  Line 5627 + 0xb	C
 	perl517.dll!Perl_gv_init_pvn(interpreter * my_perl=0x00342c14, gv * gv=0x00ad6784, hv * stash=0x00ad1654, const char * name=0x280d592c, unsigned int len=7, unsigned long flags=2)  Line 382 + 0x8	C
 	perl517.dll!Perl_gv_fetchmeth_pvn(interpreter * my_perl=0x00342c14, hv * stash=0x00000007, const char * name=0x280d592c, unsigned int len=7, long level=0, unsigned long flags=0)  Line 692 + 0x16	C
 	perl517.dll!Perl_gv_fetchmeth_pvn_autoload(interpreter * my_perl=0x00342c14, hv * stash=0x00ad1654, const char * name=0x280d592c, unsigned int len=7, long level=0, unsigned long flags=0)  Line 857	C
 	perl517.dll!S_curse(interpreter * my_perl=0x00b06280, sv * const sv=0x00ad0c64, const char check_refcnt='&#x2401;')  Line 6446 + 0x12	C
 	perl517.dll!Perl_sv_clear(interpreter * my_perl=0x00342c14, sv * const orig_sv=0x00ad0c64)  Line 6117 + 0xb	C
 	perl517.dll!Perl_sv_free2(interpreter * my_perl=0x00342c14, sv * const sv=0x00ad0c64, const unsigned long rc=1)  Line 6584	C
 	perl517.dll!S_SvREFCNT_dec(interpreter * my_perl=0x00342c14, sv * sv=0x00000020)  Line 62 + 0xb	C
 	perl517.dll!do_clean_objs(interpreter * my_perl=0x00342c14, sv * const ref=0x00ad0a04)  Line 480 + 0x13	C
 	perl517.dll!S_visit(interpreter * my_perl=0x00342c14, void (interpreter *, sv *)* f=0x28083583, const unsigned long flags=2048, const unsigned long mask=2048)  Line 423	C
 	perl517.dll!Perl_sv_clean_objs(interpreter * my_perl=0x00b06280)  Line 581	C
 	perl517.dll!perl_destruct(interpreter * my_perl=0x00342c14)  Line 772	C
 	perl517.dll!RunPerl(int argc=2, char * * argv=0x01345c98, char * * env=0x00343f70)  Line 275	C
 	perl.exe!mainCRTStartup()  Line 398 + 0xe	C
 	kernel32.dll!_BaseProcessStart@4()  + 0x23	
&lt;/code&gt;
It tried to read address 0x25.
Another crash:
&lt;code&gt;
&gt;	Unicode.dll!enc_pack(interpreter * my_perl=0x00342c14, sv * result=0x00000000, unsigned int size=11343540, unsigned char endian='V', unsigned long value=86)  Line 104	C
 	Unicode.dll!XS_Encode__Unicode_encode_xs(interpreter * my_perl=0x00342c00, cv * cv=0x00ad0d34)  Line 378 + 0x16	C
 	perl517.dll!Perl_pp_entersub(interpreter * my_perl=0x00000002)  Line 2877	C
 	perl517.dll!Perl_runops_standard(interpreter * my_perl=0x00342c14)  Line 42 + 0x4	C
 	perl517.dll!S_run_body(interpreter * my_perl=0x00000004, long oldscope=1)  Line 2430 + 0xa	C
 	perl517.dll!perl_run(interpreter * my_perl=0x00342c14)  Line 2346 + 0x8	C
 	perl517.dll!RunPerl(int argc=2, char * * argv=0x01345c98, char * * env=0x00343f70)  Line 270 + 0x6	C
 	perl.exe!mainCRTStartup()  Line 398 + 0xe	C
 	kernel32.dll!_BaseProcessStart@4()  + 0x23	
&lt;/code&gt;</field>
</data>
</node>
