PerlMonks  

Re^4: CR-LF on UTF-16LE files on Windows

by james28909 (Deacon)
on Nov 08, 2018 at 00:04 UTC


in reply to Re^3: CR-LF on UTF-16LE files on Windows
in thread CR-LF on UTF-16LE files on Windows

Would binmode() work? I don't have any files like that at my disposal to test.

Re^5: CR-LF on UTF-16LE files on Windows
by ikegami (Pope) on Nov 08, 2018 at 08:00 UTC

    binmode would not work.

    When binmode applies :raw, it disables any existing :crlf layer rather than removing it. A subsequent :crlf then re-enables that existing :crlf layer rather than adding a new one on top. That means that

    binmode($fh, ':raw:encoding(UTF-16LE):crlf')

    is no different than

    binmode($fh, ':encoding(UTF-16LE)')

    It's therefore impossible to use binmode to apply :encoding(UTF-16LE) to STDIN, STDOUT and STDERR on Windows if you also want :crlf translation to happen above the encoding layer. You'd need something like the following instead:

    open(my $fh, '<&=:raw:encoding(UTF-16le):crlf', fileno(STDIN)); *STDIN = $fh;

    (Untested)
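One way to check this claim on a given perl is to write a short string through each layer specification and then look at the raw bytes on disk. The following is a minimal sketch, not part of the original reply: the helper name `bytes_written_with` is hypothetical, and the temporary file stands in for a real handle such as STDOUT. If `:crlf` ends up below the encoding layer, the CR is inserted after encoding and the line ending shows up as a lone `0d` byte instead of the UTF-16LE pair `0d 00`.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical helper: write "123\n" through the given layers (applied
# with binmode, as in the discussion above) and return the bytes that
# actually landed in the file, as space-separated hex pairs.
sub bytes_written_with {
    my ($layers) = @_;

    # Create a temporary file and close it so it can be reopened cleanly.
    my ($tmp, $path) = tempfile(UNLINK => 1);
    close $tmp;

    # Open with the platform's default layers, then apply the spec.
    open my $out, '>', $path or die "open for write: $!";
    binmode $out, $layers or die "binmode: $!";
    print {$out} "123\n";
    close $out or die "close: $!";

    # Read the file back with no translation at all.
    open my $in, '<:raw', $path or die "open for read: $!";
    local $/;    # slurp mode
    my $raw = <$in>;
    close $in;

    return join ' ', unpack '(H2)*', $raw;
}

# Compare the two specifications discussed in the reply.
for my $spec (':raw:encoding(UTF-16LE):crlf', ':encoding(UTF-16LE)') {
    print "$spec => ", bytes_written_with($spec), "\n";
}
```

The result depends on the platform's default layers and on the perl version, so the two lines of output are best compared on the machine in question rather than assumed.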

      binmode would not work

      Is that so? I get the same (cases 3 and 4) correct result, regardless of layers stack being built through open or binmode.

      use strict;
      use warnings;
      use feature 'say';
      use autodie;
      $, = ' ';

      {
          open my $f, '>:raw:encoding(UTF-16LE):crlf', 'test';
          say $f 123;
      }

      { # 1 "pure binary slurp"
          open my $f, '<:raw', 'test';
          undef local $/;
          say PerlIO::get_layers( $f );
          say unpack '(H2)*', <$f>;
      }

      { # 2 OP's case
          open my $f, '<:encoding(UTF-16LE)', 'test';
          say PerlIO::get_layers( $f );
          say unpack '(H2)*', <$f>;
      }

      { # 3 correct
          open my $f, '<:raw:encoding(UTF-16LE):crlf', 'test';
          say PerlIO::get_layers( $f );
          say unpack '(H2)*', <$f>;
      }

      { # 4 correct
          open my $f, '<', 'test';
          binmode $f, ':raw:encoding(UTF-16LE):crlf';
          say PerlIO::get_layers( $f );
          say unpack '(H2)*', <$f>;
      }

      { # 5 same as #2
          open my $f, '<', 'test';
          binmode $f, ':encoding(UTF-16LE)';
          say PerlIO::get_layers( $f );
          say unpack '(H2)*', <$f>;
      }

      __END__
      unix crlf
      31 00 32 00 33 00 0d 00 0a 00
      unix crlf encoding(UTF-16LE) utf8
      31 32 33 0d 0a
      unix crlf encoding(UTF-16LE) utf8 crlf utf8
      31 32 33 0a
      unix crlf encoding(UTF-16LE) utf8 crlf utf8
      31 32 33 0a
      unix crlf encoding(UTF-16LE) utf8
      31 32 33 0d 0a

      But can the output of PerlIO::get_layers be believed at all? It lists a few utf8 (pseudo?) layers that I didn't ask for. Also, the bottommost crlf layer is not removed but merely disabled in cases 3 and 4 (and in case 1, too), and it is not re-enabled later.

      However, I can :pop (rather than "disable") existing layers, and here open and binmode behave differently: binmode doesn't allow popping all the way down to the bottom of the stack. I don't know whether these factoids are of any value, though.

      { # 6
          open my $f, '<:pop:pop:unix:encoding(UTF-16LE):crlf', 'test';
          say PerlIO::get_layers( $f );
          say unpack '(H2)*', <$f>;
      }

      { # 7
          open my $f, '<', 'test';
          binmode $f, ':pop:pop:unix:encoding(UTF-16LE):crlf';
          say PerlIO::get_layers( $f );
          say unpack '(H2)*', <$f>;
      }

      __END__
      unix encoding(UTF-16LE) utf8 crlf utf8
      31 32 33 0a
      unix encoding(UTF-16LE) utf8 crlf utf8
      Use of uninitialized value in unpack at crlf.pl line 49.
      refcnt_dec: fd 0: 0 <= 0

        Just verified that #4 didn't use to work. I tested with 5.10.1. I don't know at what point it was fixed.

        Things may have changed. I'm very busy and didn't (and still don't) have a chance to test.
