PerlMonks  

As of today using Strawberry Perl 5.16.0.1 (64bit), Perl doesn't automatically adapt chomp at all to the platform nor converts automatically the carriage return from files that are read.

But Perl doesn’t work like that. As ikegami explained in his post up-thread, removal of the carriage return (CR) character occurs before the string is ever fed to chomp. See the section Defaults and how to override them in PerlIO. On Windows platforms the IO layers default to unix (the lowest level layer) plus a crlf layer on top of this. This crlf is:

A layer that implements DOS/Windows like CRLF line endings. On read converts pairs of CR,LF to a single "\n" newline character. On write converts each "\n" to a CR,LF pair.

So, it is the crlf layer that converts each literal 0d0a to Perl’s logical newline "\n" (0a), and this happens as each line of the file is read in.
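To see that conversion on any platform, here is a minimal sketch that writes a scratch file with literal CRLF endings and reads it back twice, once through an explicit :crlf layer and once through :raw. The helper names write_crlf_file and read_lines_as_hex are mine, purely for illustration, and the explicit :crlf stands in for what Windows does by default.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Write a small scratch file containing literal CRLF (0d 0a) line endings.
sub write_crlf_file {
    my ($fh, $filename) = tempfile(UNLINK => 1);
    binmode $fh;                        # raw bytes: no newline translation on write
    print {$fh} "alpha\r\nbeta\r\n";
    close $fh or die "Cannot close $filename: $!";
    return $filename;
}

# Read every line through the given layer and return each line as a hex string.
sub read_lines_as_hex {
    my ($layer, $filename) = @_;
    open(my $fh, "<$layer", $filename) or die "Cannot open $filename: $!";
    my @hex = map { unpack 'H*', $_ } <$fh>;
    close $fh;
    return @hex;
}

my $file = write_crlf_file();
print "crlf: $_\n" for read_lines_as_hex(':crlf', $file);   # lines end in 0a
print "raw:  $_\n" for read_lines_as_hex(':raw',  $file);   # lines end in 0d0a
```

Through :crlf each line should end in 0a alone; through :raw the 0d is still there.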

The input record separator, $/, defaults to "\n" (0a), Perl’s logical newline character, regardless of the platform. And chomp($string) removes from the end of $string the character(s) (if any) currently assigned to $/. This all works out fine on Windows, because each CRLF has already been converted into a LF by the crlf IO layer before chomp comes into play.
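A small sketch of that relationship between chomp and $/, using a string that still carries a CR (as it would if no crlf layer had touched it). The sub name chomp_with_separator is hypothetical.

```perl
use strict;
use warnings;

# chomp strips only what $/ currently holds and returns the number of
# characters it removed.
sub chomp_with_separator {
    my ($string, $separator) = @_;
    local $/ = $separator;              # restored automatically on return
    my $removed = chomp($string);
    return ($string, $removed);
}

# With the default separator only the LF goes; the CR stays behind.
my ($text, $count) = chomp_with_separator("abc\r\n", "\n");
printf "separator LF:   %s (%d removed)\n", unpack('H*', $text), $count;

# With $/ re-assigned to CRLF, chomp removes both characters.
($text, $count) = chomp_with_separator("abc\r\n", "\r\n");
printf "separator CRLF: %s (%d removed)\n", unpack('H*', $text), $count;
```

The first call leaves 6162630d (one character removed), the second leaves 616263 (two removed).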

Now, I can’t test Strawberry Perl 5.16.0.1 (64bit) as I don’t have a 64-bit OS. However, if the latest 64-bit version of Strawberry Perl does handle CRs incorrectly, it’s likely the difference is due to a change in the default IO layers. It’s unlikely to be in any way related to the implementation of the chomp function, for the reasons outlined above. I can confirm that the 32-bit versions of Strawberry Perl 5.14.2 and 5.16.0 (64int) treat carriage returns identically when reading in a text file under Vista.
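One way to check the default layers on a given installation is PerlIO::get_layers, which lists the layers on a handle. The sketch below compares a plain open with an explicit :crlf open; layers_of is just an illustrative wrapper, and the scratch file is empty because only the layers matter.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Return the names of the PerlIO layers active on a file handle.
sub layers_of {
    my ($fh) = @_;
    return PerlIO::get_layers($fh);
}

# An empty scratch file is enough for inspecting layers.
my ($tmp, $filename) = tempfile(UNLINK => 1);
close $tmp or die "Cannot close $filename: $!";

open(my $default, '<', $filename) or die "Cannot open $filename: $!";
print "default layers: @{[ layers_of($default) ]}\n";
close $default;

open(my $explicit, '<:crlf', $filename) or die "Cannot open $filename: $!";
print "explicit :crlf: @{[ layers_of($explicit) ]}\n";
close $explicit;
```

On a Windows build I would expect crlf to appear in the first list already. If it does not, the default layers (or something overriding them, such as the PERLIO environment variable or the open pragma) would be the place to look.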

Of course, the above applies only to text files. If binmode is applied to a file handle before it is read, conversion is suppressed, CRs are retained, and chomp has no effect on these CRs (unless $/ has been re-assigned).
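The effect of binmode can be sketched the same way. Here the handle is opened with an explicit :crlf layer (standing in for the Windows default, so the script behaves the same elsewhere), and binmode is optionally applied before the first read. The sub name first_line_chomped is hypothetical.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Read the first line of a file, optionally calling binmode first, then chomp it.
sub first_line_chomped {
    my ($filename, $use_binmode) = @_;
    open(my $fh, '<:crlf', $filename) or die "Cannot open $filename: $!";
    binmode $fh if $use_binmode;        # must happen before the first read
    my $line = <$fh>;
    close $fh;
    chomp $line;                        # $/ is still the default "\n"
    return $line;
}

# Scratch file with one CRLF-terminated line, written as raw bytes.
my ($out, $filename) = tempfile(UNLINK => 1);
binmode $out;
print {$out} "alpha\r\n";
close $out or die "Cannot close $filename: $!";

printf "text mode: %s\n", unpack 'H*', first_line_chomped($filename, 0);
printf "binmode:   %s\n", unpack 'H*', first_line_chomped($filename, 1);
```

In text mode the chomped line is 616c706861; with binmode the trailing 0d survives chomp.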

Update: Try specifying the input layer explicitly:

open(my $fh, '<:crlf', $filename) or die "Cannot open $filename: $!";

Athanasius <°(((><contra mundum


In reply to Re^4: chomp() is confusing by Athanasius
in thread Why chomp() is not considering carriage-return by jesuashok
