<?xml version="1.0" encoding="windows-1252"?>
<node id="985845" title="Substring giving strange results on $1 with utf8" created="2012-08-06 18:01:49" updated="2012-08-06 18:01:49">
<type id="115">
perlquestion</type>
<author id="832495">
choroba</author>
<data>
<field name="doctext">
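Most honourable brothers and sisters in Perl. At work, we are having issues with [doc://substr]. It returns wrong values when called directly on the special variable &lt;c&gt;$1&lt;/c&gt;, while the same call on a lexical copy of the capture works.&lt;br&gt;
The following code creates the test script and two UTF-8 encoded input files, then runs the script on each of them: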
&lt;c&gt;
use strict;
use warnings;

open my $PL, '&gt;', 'utf2.pl' or die $!;
print {$PL} &lt;&lt; '__PL__';
##########################################################
use strict;
use warnings;

binmode STDOUT, ':utf8';
binmode STDIN, ':utf8';

while (my $line = &lt;&gt;) {
    if (my ($word) = $line =~ /^(.+)$/) {
        my $one   = substr($1,    0, 1);    # doesn't work
        my $w_one = substr($word, 0, 1);    # works
        print "'$one' = '$w_one'\tat $line" unless $one eq $w_one;
    }
}
##########################################################
__PL__
close $PL;

open my $OUT1, '&gt;', 'utf1' or die $!;
print {$OUT1} map chr hex, qw/61 61 c5 99 0a c4 8d 0a 61 61 c5 99 0a/;
close $OUT1;
open my $OUT2, '&gt;', 'utf2' or die $!;
print {$OUT2} map chr hex, qw/c4 8d 0a 61 61 c5 99 0a c4 8d 0a/;
close $OUT2;

system "$^X utf2.pl &lt; utf1";
print "\n";
system "$^X utf2.pl &lt; utf2";
&lt;/c&gt;
The output is (tested in blead 5.17.3, on x86_64-linux-thread-multi):
&lt;pre&gt;'&amp;#65533;' = '&amp;#269;'       at &amp;#269;

'aa&amp;#345;' = 'a'     at aa&amp;#345;
'&amp;#269;&amp;#65533;' = '&amp;#269;'      at &amp;#269;
&lt;/pre&gt;
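&lt;p&gt;A minimal sketch of a workaround, assuming (as the "works" line in the test suggests) that copying the capture into a lexical before calling substr avoids the problem. The helper name &lt;c&gt;first_char&lt;/c&gt; and the in-memory sample strings are hypothetical and not part of the test above. It does not explain why &lt;c&gt;$1&lt;/c&gt; misbehaves; it only avoids calling substr on it directly.
&lt;c&gt;
use strict;
use warnings;

# Hypothetical helper: return the first character of the first capture.
# The capture is copied into a lexical before substr is called on it,
# which mirrors the '$word' line that works in the test script.
sub first_char {
    my ($line) = @_;
    return unless $line =~ /^(.+)$/;
    my $copy = $1;                  # plain scalar copy of the capture
    return substr($copy, 0, 1);
}

# Sample lines are character strings here (an assumption of this sketch);
# the test script reads the same characters from STDIN through ':utf8'.
for my $line ("aa\x{159}\n", "\x{10d}\n", "aa\x{159}\n") {
    my $first = first_char($line);
    printf "U+%04X\n", ord $first;  # prints U+0061, U+010D, U+0061
}
&lt;/c&gt;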

Do you have any explanation? Should I submit a bug report?
&lt;p&gt;Update: Thanks all, [https://rt.perl.org/rt3/Public/Bug/Display.html?id=114410|bug report] sent.

</field>
</data>
</node>
