http://www.perlmonks.org?node_id=264423

mikevanhoff has asked for the wisdom of the Perl Monks concerning the following question:

Can anyone tell me if there is an equivalent to "\n" that actually starts a new line but does NOT leave a "^M" in the output file? The original file format is ANSI 837.

Re: Line Feeds
by halley (Prior) on Jun 09, 2003 at 19:06 UTC

    If you must specify file contents as an exact byte sequence, use literal octal or hexadecimal notation, not the semantic names like \n or \r. The semantic names are subject to translation according to platform-specific conventions. Call binmode on your filehandle to ensure that no translation occurs outside of your control.

    For example, binmode(OUTPUT); print OUTPUT "Hello\x0D\x0A"; writes a carriage return followed by a line feed on all platforms. Conversely, dropping the \x0D part ensures there is no carriage return (which appears as ^M in some editors).
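
    The one-liner above can be expanded into a small script that writes records with an explicit terminator and then reads the file back as raw bytes to see what landed on disk. This is only a sketch: the helper names write_records and slurp_raw, the file name records.out, and the sample records are invented for illustration, and an ASCII platform is assumed.

    ```perl
    use strict;
    use warnings;

    # Hypothetical helper: write each record followed by an exact terminator.
    sub write_records {
        my ($path, $terminator, @records) = @_;
        open(my $out, '>', $path) or die "Cannot open $path: $!";
        binmode($out);    # no newline translation on this handle
        print {$out} $_, $terminator for @records;
        close($out) or die "Cannot close $path: $!";
    }

    # Hypothetical helper: read the whole file back as raw bytes.
    sub slurp_raw {
        my ($path) = @_;
        open(my $in, '<', $path) or die "Cannot open $path: $!";
        binmode($in);
        local $/;         # slurp mode: read everything in one go
        my $bytes = <$in>;
        close($in);
        return $bytes;
    }

    my $path = 'records.out';    # assumed scratch file name
    write_records($path, "\x0A", 'first record', 'second record');
    print unpack('H*', slurp_raw($path)), "\n";    # hex dump of the file
    unlink $path;
    ```

    Passing "\x0D\x0A" as the terminator instead should produce the CR LF pair, which is what shows up as ^M in editors that expect a bare line feed.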

    --
    [ e d @ h a l l e y . c c ]

      That is pretty misleading (or just incorrect). "\n" and "\x0A" are exactly the same thing unless you are on a non-ASCII system or an old Mac. Using "\x0A" is only an improvement when on an old Mac. On a non-ASCII system, using "\x0A" is likely to simply break things. On all other systems, using "\x0A" is identical to using "\n".

      So, there are no systems where "\x0A" is likely to be subject to fewer translations (since old Macs don't translate either character and non-ASCII systems will likely be translating the whole character set or none of it, depending on destination).

      Your second paragraph is correct if you add "on an ASCII system".
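
      A quick way to check this on a given machine is to compare the two escapes directly. This is a minimal sketch; the helper name newline_is_lf is made up for illustration.

      ```perl
      use strict;
      use warnings;

      # Returns 1 when "\n" is the single character with ordinal 10 on this
      # platform, 0 otherwise.
      sub newline_is_lf {
          return "\n" eq "\x0A" ? 1 : 0;
      }

      printf "ord(\"\\n\") is %d\n", ord("\n");
      print newline_is_lf()
          ? "\\n and \\x0A are the same character here\n"
          : "\\n and \\x0A differ here\n";
      ```

      On an ASCII system that is not an old Mac, this should report that the two are the same character.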

                      - tye

        Please read the writeup at binmode, as well as the bit about newlines in perlport. And yes, I believe Perl is implemented on many non-ASCII systems.

        While "\n" and "\x0A" are exactly the same thing on ASCII systems in the storage of Perl scalars, that's a mouthful to say. By omission, that means that they may NOT be the same thing on disk, or via socket, or on non-ASCII systems.
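
        The difference between the scalar and what reaches the disk can be seen by writing the same string through two PerlIO layers and dumping the resulting bytes. This is a sketch only: the helper name bytes_on_disk and the scratch file layer_demo.tmp are invented, and an ASCII platform with PerlIO layers (Perl 5.8 or later) is assumed.

        ```perl
        use strict;
        use warnings;

        # Write $text through the given PerlIO layer, then return the raw bytes
        # that ended up in the file.
        sub bytes_on_disk {
            my ($layer, $text) = @_;
            my $path = 'layer_demo.tmp';    # assumed scratch file name
            open(my $out, ">$layer", $path) or die "Cannot open $path: $!";
            print {$out} $text;
            close($out) or die "Cannot close $path: $!";

            open(my $in, '<:raw', $path) or die "Cannot open $path: $!";
            local $/;                       # slurp mode
            my $bytes = <$in>;
            close($in);
            unlink $path;
            return $bytes;
        }

        # Same scalar, two layers: print a hex dump of each result.
        for my $layer (':raw', ':crlf') {
            printf "%-6s %s\n", $layer, unpack('H*', bytes_on_disk($layer, "Hi\n"));
        }
        ```

        With the :crlf layer the single "\n" in the scalar should come out as the two bytes 0D 0A, while :raw leaves it as 0A alone.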

        This is akin to the HTML argument between semantic <strong> and literal <b>. Semantics enforce user/platform preferences, and literals enforce author preferences.

        My advice was to use semantic names when you want semantic meanings, and use literal numerical values when being literal is important. Binmode tells Perl you care. The syntax you use tells the developer you care. Remember, source code is for the human to read, too, and using \x0A tells the maintenance programmer that the byte values matter. I don't see how that's misleading or incorrect.

        --
        [ e d @ h a l l e y . c c ]

Re: Line Feeds
by mikevanhoff (Acolyte) on Jun 09, 2003 at 20:42 UTC
    Thanks for all of the help.