http://www.perlmonks.org?node_id=427108

bobf has asked for the wisdom of the Perl Monks concerning the following question:

I wrote a program that is being run on different platforms (Linux, Windows, and Mac). I tried to write it to be as system-independent as I could (using File::Spec for paths, etc), but recently someone reported a bug. It turns out that she was creating one of the input files on a Mac, then transferring it to a Windows machine and running the program (I didn't think of that...). The error occurred when the program tried to read the input file line-by-line. I presume that since the program was run on a Windows machine, the input record separator ($/) was set to the Windows newline (\015\012). The input file was created on a Mac, though, so it had newlines of \015. As a result, the file got slurped and things turned ugly.
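A minimal reproduction of the symptom, as a sketch: an in-memory filehandle stands in for the Mac-created file, and the sample data is made up for illustration. As far as I can tell from perlport, $/ actually defaults to "\n" on every platform and the \015\012 translation on Windows happens in the I/O layer, but the effect on a \015-only file is the same: no record separator is ever found.

```perl
use strict;
use warnings;

# Stand-in for the Mac-created input: three lines ending in a bare \015.
my $mac_data = "first\015second\015third\015";

# Read it "line by line" with the default record separator.
open my $fh, '<', \$mac_data or die "Cannot open in-memory file: $!";
my @records = <$fh>;    # $/ is "\n" (\012), which never occurs in the data
close $fh;

# The whole file comes back as a single record.
printf "records read: %d\n", scalar @records;
```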

Now I'm trying to figure out how to handle this situation. I reread perlport, as well as 3 questions... 2 about newlines, and one on how to be NICE and Line Feeds, but they all seem to address writing specific newline characters to files, not reading them.

Here is what I came up with so far:

  1. Use $^O, but if I understand it correctly that will just tell me about the system the program is running on, which (as exemplified here) is not necessarily the same as the system that created the file.
  2. Use a regex to match the newline character(s) in the file. I think this would require slurping the whole file and then doing something like if( $file =~ m/\015$/ ) (which assumes the file will end with a newline) or if( $file =~ m/\015(?!\012)/ ) (which doesn't), setting $/ according to what matched, and re-reading the file line-by-line.
  3. Preprocess the input file to convert all newline characters to the current system's newline character. I experimented a little, and I think this will work:
    $file =~ s/\015\012|\015|\012/\n/g;

    I think this is my favorite solution, but it seems like a lot of extra overhead for each input file since the conversion only needs to occur once (assuming the input file is not then moved to another OS).
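A variant of option 2 that might avoid slurping is to sniff only the first chunk of the file, set $/ from what is found, and then read line by line as usual. This is only a rough sketch: guess_newline and read_lines are hypothetical helper names, the 4096-byte sample size is arbitrary, and it assumes each file uses one newline convention throughout.

```perl
use strict;
use warnings;

# Hypothetical helper: guess the record separator from the start of a file.
# Returns "\015\012", "\015", or "\012" (the fallback when none is found).
sub guess_newline {
    my ($path, $sample_size) = @_;
    $sample_size ||= 4096;

    open my $fh, '<', $path or die "Cannot open $path: $!";
    binmode $fh;    # raw bytes, so no CRLF translation while sniffing
    my $buf = '';
    my $got = read $fh, $buf, $sample_size;
    die "Cannot read $path: $!" unless defined $got;

    # If the sample ends on \015, its \012 partner may be just past the cut;
    # fetch one more byte so a CRLF pair is not mistaken for a bare CR.
    if ($buf =~ /\015\z/) {
        my $extra = '';
        read $fh, $extra, 1;
        $buf .= $extra if defined $extra && length $extra;
    }
    close $fh;

    # The first newline sequence in the sample decides; CRLF is tried first.
    return $1 if $buf =~ /(\015\012|\015|\012)/;
    return "\012";
}

# Hypothetical helper: read a file line by line with the guessed separator.
sub read_lines {
    my ($path) = @_;
    local $/ = guess_newline($path);

    open my $fh, '<', $path or die "Cannot open $path: $!";
    binmode $fh;    # keep the bytes as-is so $/ matches what was sniffed
    my @lines;
    while (my $line = <$fh>) {
        chomp $line;    # chomp strips whatever $/ currently holds
        push @lines, $line;
    }
    close $fh;
    return @lines;
}
```

Something like my @lines = read_lines($input_file); would then behave the same no matter which OS created the file, at the cost of opening each file twice. A file with mixed newline conventions would still need the conversion from option 3.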

Are there better ways of handling this?

Thanks!