It still matters with newer perls, too.
It's kind of a pity that the patch you linked to doesn't really fix the issue it (apparently) set out to fix, i.e. the long-standing bug with encodings like UTF-16 in combination with the :crlf layer.
I just checked with 5.15.8, and I still see the same "unexpected" behavior as before. That is, when naïvely pushing a UTF-16 encoding layer on top of :crlf (which is what happens on Windows), writing produces corrupted files, and carriage returns are not removed on reading:
--- writing ---
#!/usr/local/perl/5.15.8/bin/perl -w
my $fname = "foo.utf16";
open my $out, ">:crlf:encoding(UTF-16LE)", $fname or die;
print $out "\x{feff}\x{1234}\n\x{5678}\n";
$ ./test-out.pl
$ hexdump foo.utf16
0000000 feff 1234 0a0d 7800 0d56 000a
000000c
Wrong! The correct encoding should be:
$ hexdump foo.utf16
0000000 feff 1234 000d 000a 5678 000d 000a
000000e
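The corrupted bytes can be reproduced without any PerlIO layers at all. The following is only a sketch, assuming the cause is that a :crlf layer sitting below the encoding layer rewrites bytes rather than characters; the variable names are mine, and only the core Encode module is used:

```perl
use strict;
use warnings;
use Encode qw(encode);

my $text = "\x{feff}\x{1234}\n\x{5678}\n";

# Roughly what ">:crlf:encoding(UTF-16LE)" does on output (assumption):
# encode first, then expand every 0x0A *byte* to 0x0D 0x0A.
(my $byte_level = encode("UTF-16LE", $text)) =~ s/\x0a/\x0d\x0a/g;

# What is actually wanted: expand "\n" to "\r\n" on characters, then encode.
(my $crlf_text = $text) =~ s/\n/\r\n/g;
my $char_level = encode("UTF-16LE", $crlf_text);

printf "byte-level: %s\n", unpack("H*", $byte_level);
printf "char-level: %s\n", unpack("H*", $char_level);
# byte-level: fffe34120d0a0078560d0a00
# char-level: fffe34120d000a0078560d000a00
```

Allowing for hexdump's byte swapping within each 16-bit word, the first line matches the corrupted file and the second matches the correct one.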
--- reading ---
#!/usr/local/perl/5.15.8/bin/perl -w
use Devel::Peek;
my $fname = "foo.utf16";
# create correct file, using the same old layer mantra
# (the extra :utf8 is only required with older perls)
open my $out, ">:raw:encoding(UTF-16LE):crlf:utf8", $fname or die;
print $out "\x{feff}\x{1234}\n\x{5678}\n";
close $out;
# read file back in
open my $in, "<:crlf:encoding(UTF-16LE)", $fname or die;
$/ = undef;
Dump <$in>;
$ ./test-in.pl
SV = PV(0x77dc60) at 0x953728
REFCNT = 1
FLAGS = (TEMP,POK,pPOK,UTF8)
PV = 0x829130 "\357\273\277\341\210\264\r\n\345\231\270\r\n"\0 [UTF8
    "\x{feff}\x{1234}\r\n\x{5678}\r\n"]
                     ^^          ^^
CUR = 13
LEN = 14
Wrong! \r should've been removed.
(Note that because I tested this on Unix, I had to push :crlf myself. With a native Windows perl, the layer would of course already have been in place, i.e., you'd just say ">:encoding(UTF-16LE)" or "<:encoding(UTF-16LE)", as anyone unaware of the issue would likely have tried.)
Personally, I think allowing another :crlf to be pushed on the stack (as is now possible after the patch) is not the right approach to fixing the issue, because you still have to manually rearrange the layers to get correct results. I fail to see the benefit of being allowed to have two :crlf layers now.
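For reference, here is a minimal round-trip sketch of the manual rearrangement, using the same layer mantra as in the read test above on both the write and the read side. The temporary file and variable names are just for illustration, and I'm assuming the trailing :utf8 is harmless on newer perls:

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Scratch file for the test; removed automatically at exit.
my (undef, $fname) = tempfile(SUFFIX => ".utf16", UNLINK => 1);
my $text = "\x{feff}\x{1234}\n\x{5678}\n";

# :crlf goes *above* the encoding layer, so it sees characters, not
# UTF-16 bytes (the extra :utf8 is only needed with older perls).
my $layers = ":raw:encoding(UTF-16LE):crlf:utf8";

open my $out, ">$layers", $fname or die "open for write: $!";
print {$out} $text;
close $out or die "close: $!";

# Read the file back with the same stack and slurp it whole.
open my $in, "<$layers", $fname or die "open for read: $!";
my $got = do { local $/; <$in> };
close $in;

print $got eq $text ? "round trip ok\n" : "round trip FAILED\n";
```

The point is that the order has to be spelled out by hand on both sides; simply adding the encoding layer on top of a default :crlf does not get you there.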