It still matters with newer perls, too.

It's kind of a pity the patch you linked to doesn't really fix the issue it (apparently) set out to fix, i.e. the long-standing bug with encodings like UTF-16 in combination with the :crlf layer.

I just checked it with 5.15.8, and I still see the same "unexpected" behavior as before. That is, when naïvely pushing a UTF-16 layer to enable UTF-16 functionality (on Windows), corrupted files are produced on writing, and carriage returns are not removed on reading:

--- writing ---

#!/usr/local/perl/5.15.8/bin/perl -w

my $fname = "foo.utf16";
open my $out, ">:crlf:encoding(UTF-16LE)", $fname or die;
print $out "\x{feff}\x{1234}\n\x{5678}\n";

$ ./
$ hexdump foo.utf16
0000000 feff 1234 0a0d 7800 0d56 000a
000000c

Wrong! The correct encoding should be:

$ hexdump foo.utf16
0000000 feff 1234 000d 000a 5678 000d 000a
000000e
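To see where the two byte sequences come from, here is a minimal in-memory sketch that does not use PerlIO layers at all. It assumes that a :crlf layer sitting below :encoding behaves like a byte-level LF to CRLF substitution applied after encoding; the variable names are made up for illustration.

```perl
use strict;
use warnings;
use Encode qw(encode);

my $text = "\x{feff}\x{1234}\n\x{5678}\n";

# Wrong order: encode first, then translate LF -> CRLF on the encoded bytes.
# A lone 0x0D byte lands in front of the 0x0A byte of each UTF-16LE "\n".
(my $wrong = encode('UTF-16LE', $text)) =~ s/\x0a/\x0d\x0a/g;

# Right order: translate LF -> CRLF on characters, then encode,
# so both "\r" and "\n" become full 16-bit code units.
(my $chars = $text) =~ s/\n/\r\n/g;
my $right = encode('UTF-16LE', $chars);

print "wrong: ", unpack('H*', $wrong), "\n";
print "right: ", unpack('H*', $right), "\n";
```

Note that hexdump prints little-endian 16-bit words by default, so the byte pairs in this hex string appear swapped relative to the dumps above.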

--- reading ---

#!/usr/local/perl/5.15.8/bin/perl -w

use Devel::Peek;

my $fname = "foo.utf16";

# create correct file, using the same old layer mantra
# (the extra :utf8 is only required with older perls)
open my $out, ">:raw:encoding(UTF-16LE):crlf:utf8", $fname or die;
print $out "\x{feff}\x{1234}\n\x{5678}\n";
close $out;

# read file back in
open my $in, "<:crlf:encoding(UTF-16LE)", $fname or die;
$/ = undef;
Dump <$in>;

$ ./
SV = PV(0x77dc60) at 0x953728
  REFCNT = 1
  FLAGS = (TEMP,POK,pPOK,UTF8)
  PV = 0x829130 "\357\273\277\341\210\264\r\n\345\231\270\r\n"\0
       [UTF8 "\x{feff}\x{1234}\r\n\x{5678}\r\n"]
  CUR = 13                    ^           ^
  LEN = 14

Wrong!  \r should've been removed.
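One possible workaround on the reading side (my own sketch, not something the patch provides) is to keep :crlf out of the stack entirely, decode with :raw:encoding(UTF-16LE), and strip the carriage returns on the decoded characters. The temporary file and variable names are only for illustration.

```perl
use strict;
use warnings;
use Encode qw(encode);
use File::Temp qw(tempfile);

# Build a correctly encoded UTF-16LE file with CRLF line endings.
my ($fh, $fname) = tempfile(UNLINK => 1);
binmode $fh;    # write raw bytes, no translation
print {$fh} encode('UTF-16LE', "\x{feff}\x{1234}\r\n\x{5678}\r\n");
close $fh or die "close: $!";

# Read it back: decode first, then remove the CRs on characters.
open my $in, '<:raw:encoding(UTF-16LE)', $fname or die "open: $!";
my $text = do { local $/; <$in> };
close $in;
$text =~ s/\r\n/\n/g;

# Show the resulting code points.
print join(' ', map { sprintf 'U+%04X', ord } split //, $text), "\n";
```

This sidesteps the layer-ordering question at the cost of doing the newline translation by hand.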

(Note that because I tested this on Unix, I had to push :crlf myself. With a native Windows perl, the layer would of course already have been in place — i.e., you'd just say ">:encoding(UTF-16LE)" or "<:encoding(UTF-16LE)" (as anyone unaware of the issue would likely have tried).)

Personally, I think allowing another :crlf to be pushed on the stack (as it is now after the patch) is not the right approach to fix the issue, because you still have to manually rearrange the layers to get correct results.  I fail to see the benefit of being allowed to have two :crlf layers now.
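To check which order the layers actually ended up in, PerlIO::get_layers can list the stack for a handle. The following is a rough sketch; layers_for is a hypothetical helper, and the exact list printed depends on the platform and perl version, so no particular output is claimed here.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Open a scratch file with the given mode string and return its layer
# names, bottom of the stack first.
sub layers_for {
    my ($mode) = @_;
    my (undef, $fname) = tempfile(UNLINK => 1);
    open my $fh, $mode, $fname or die "open $mode: $!";
    my @layers = PerlIO::get_layers($fh);
    close $fh;
    return @layers;
}

# Compare the naive stack with the manually rearranged one.
for my $mode ('>:crlf:encoding(UTF-16LE)', '>:raw:encoding(UTF-16LE):crlf') {
    print "$mode => @{[ layers_for($mode) ]}\n";
}
```

On output, data passes through the layers from the top of the stack down, so :crlf has to sit above the encoding layer to see characters rather than encoded bytes.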

In reply to Re^3: Perl Windows vs Cygwin installs by Eliya
in thread Perl Windows vs Cygwin installs by gholley0
