http://www.perlmonks.org?node_id=214488

mobiGeek has asked for the wisdom of the Perl Monks concerning the following question:

I need to process a text file but cannot be sure ahead of time if it's a DOS file (\r\n) or a Unix file (\n). I want to make changes to the file (add, delete, or modify lines), and I want to preserve its line-endedness.

One way is to detect line-endedness

$first_line =~ m@(\r?\n)@ and $NL = $1;
and then use $NL for any newlines, but I wonder if there isn't a niftier, Perlish way?
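
A minimal sketch of the detect-and-reuse idea described above, fleshed out so it can be run on its own. The helper names detect_newline and upcase_lines are hypothetical, and upper-casing stands in for whatever edit is actually needed. It assumes the whole file fits in memory and uses one line ending throughout.

```perl
use strict;
use warnings;

# Hypothetical helper: return the first line ending found in $text,
# falling back to "\n" when the text contains no newline at all.
sub detect_newline {
    my ($text) = @_;
    return $text =~ /(\r?\n)/ ? $1 : "\n";
}

# Hypothetical edit: upper-case every line, then rejoin with the
# line ending that was detected in the input.
sub upcase_lines {
    my ($text) = @_;
    my $nl = detect_newline($text);
    # A limit of -1 keeps trailing empty fields, so a final newline
    # (or its absence) survives the round trip.
    my @lines = split /\r?\n/, $text, -1;
    return join($nl, map { uc($_) } @lines);
}
```

Splitting on /\r?\n/ accepts either style on input, while joining with the detected ending keeps the output consistent with the original file.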

Re: Line endedness
by rinceWind (Monsignor) on Nov 20, 2002 at 16:02 UTC
    The requirements implied in your question are slightly unclear.
    "I need to process a text file but cannot be sure ahead of time if it's a DOS file (\r\n) or a Unix file (\n)."
    What platform is your script running on, and how is it getting access to the file?

    If the file is native to the platform, "\n" will be the line terminator. If the file is being accessed over NFS or Samba, that software maps the line terminators, provided the mount has been set up correctly.

    However, there are instances when the file has been transferred by some other means, and you want to respect its line termination. See perldoc perlvar on $INPUT_RECORD_SEPARATOR, $/, for how to change the default behaviour of the <> operator and the kind of newline it expects.
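
As a rough illustration of what setting $/ does, here is a small sketch that reads CRLF-terminated records from an in-memory filehandle. The helper name read_lines and the sample data are made up for the example, and it assumes the terminator is already known.

```perl
use strict;
use warnings;

# Hypothetical helper: read every record from $fh, treating $nl as the
# record terminator, and return the records without that terminator.
sub read_lines {
    my ($fh, $nl) = @_;
    binmode $fh;           # read raw bytes, no newline translation
    local $/ = $nl;        # readline splits on $nl, chomp removes $nl
    my @lines = <$fh>;
    chomp @lines;
    return @lines;
}

# Sample DOS-style data held in a scalar instead of a real file.
my $data = "one\r\ntwo\r\n";
open my $fh, '<', \$data or die "cannot open in-memory file: $!";
my @lines = read_lines($fh, "\r\n");
close $fh;
```

Using local keeps the change to $/ confined to the helper, so other reads in the program keep their default behaviour.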

    Hope this helps,

    --rW

    Update: Just thought of a few other tips given what you are trying to do. You probably want to consider the output side as well, i.e. $OUTPUT_RECORD_SEPARATOR, $\. You might also need a binmode on the filehandle, so that no newline translation is applied on top of what you read and write.
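
For the output side, something along these lines might work: a sketch that combines binmode with $\ so each print ends with a chosen terminator. The helper name write_lines is hypothetical, and the in-memory filehandle stands in for a real file.

```perl
use strict;
use warnings;

# Hypothetical helper: print each line to $fh followed by $nl.
sub write_lines {
    my ($fh, $nl, @lines) = @_;
    binmode $fh;           # write raw bytes, no newline translation
    local $\ = $nl;        # print appends $nl after every call
    print {$fh} $_ for @lines;
    return;
}

# Collect the output in a scalar instead of a real file.
my $out = '';
open my $fh, '>', \$out or die "cannot open in-memory file: $!";
write_lines($fh, "\r\n", 'one', 'two');
close $fh;
```

Because $\ is appended to every print, the lines passed in must already have had their terminators removed, for example with chomp.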

    The last time I was doing something similar, I found perldoc perlport invaluable.

      The question is unclear because the situation is unclear ;-)

      I am looking for a general answer, but the specific case is: running Perl in Cygwin on Windows. The environment is set to UNIX, but the files I need to edit are being munged by our version control system (P4). Some files remain UNIX-ified, but most are being bastar...uh...DOS-ified.

      In order to keep the "diff" output clean (p4 diff does not have a "-b" flag...), I need to handle the file however it is given to me.

      Thanks for the $/ pointer. I haven't looked at that in a very long time (I try to live in sane universes when I can...but now I'm being dragged into Windows again).