I wrote a program that is being run on different platforms (Linux, Windows, and Mac). I tried to write it to be as system-independent as I could (using File::Spec for paths, etc.), but recently someone reported a bug. It turns out that she was creating one of the input files on a Mac, then transferring it to a Windows machine and running the program there (I didn't think of that...). The error occurred when the program tried to read the input file line by line. I presume that since the program was run on a Windows machine, the input record separator ($/) was set to the Windows newline (\015\012). The input file was created on a Mac, though, so its newlines were \015. As a result, the whole file got slurped as a single line and things turned ugly.
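To make the failure concrete, here is a minimal sketch that reproduces it with an in-memory filehandle. The sample data is made up for illustration. One detail worth noting: as far as I know, $/ defaults to "\n" on every platform, and on Windows it is the text-mode I/O layer that turns \015\012 into "\n" while reading. Either way, a file that contains only \015 never produces a "\n", so the effect is the same as described above.

```perl
use strict;
use warnings;

# Hypothetical sample data: three records separated by bare CR (\015),
# the way classic Mac OS editors wrote them.
my $mac_text = "alpha\015beta\015gamma\015";

# Read it through an in-memory filehandle with the default $/ ("\n").
open my $fh, '<', \$mac_text or die "Cannot open in-memory file: $!";
my @lines = <$fh>;
close $fh;

# No "\n" ever appears in the data, so everything arrives as one record.
printf "default \$/ : %d record(s)\n", scalar @lines;

# Setting $/ to CR for the read splits the same data into three records.
open $fh, '<', \$mac_text or die "Cannot open in-memory file: $!";
my @cr_lines = do { local $/ = "\015"; <$fh> };
close $fh;
printf "\$/ = CR    : %d record(s)\n", scalar @cr_lines;
```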
Now I'm trying to figure out how to handle this situation. I reread perlport, as well as three earlier questions (two about newlines, and one on how to be NICE and Line Feeds), but they all seem to address writing specific newline characters to files, not reading them.
Here is what I came up with so far:
Use $^O, but if I understand it correctly that will just tell me about the system the program is running on, which (as exemplified here) is not necessarily the same as the system that created the file.
Use a regex to match the newline character(s) in the file. I think this would require slurping the whole file and then doing something like if( $file =~ m/\015$/ ) (which assumes the file will end with a newline) or if( $file =~ m/\015(?!\012)/ ) (which doesn't), setting $/ according to what matched, and re-reading the file line-by-line.
Preprocess the input file to convert all newline characters to the current system's newline character. I experimented a little, and I think this will work:
$file =~ s[(\015)?\012(?!\015)][\n]g;
$file =~ s[(\012)?\015(?!\012)][\n]g;
I think this is my favorite solution, but it seems like a lot of extra overhead for each input file since the conversion only needs to occur once (assuming the input file is not then moved to another OS).
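For comparison, here is a sketch of the same preprocessing idea done in a single substitution. The names normalize_newlines and read_normalized_lines are hypothetical helpers, not functions from any module. Tracing the two-regex version by hand, the (?!\015) lookahead seems to reject the first of two consecutive \015\012 pairs (a blank line in a Windows file), which would leave a stray \015 behind; putting the two-character ending first in an alternation avoids lookaheads altogether.

```perl
use strict;
use warnings;

# Hypothetical helper: rewrite CRLF, bare CR, and bare LF to the
# logical newline "\n".
sub normalize_newlines {
    my ($text) = @_;
    # CRLF is listed first so a pair is consumed as one line ending
    # rather than as two separate ones.
    $text =~ s/\015\012|\015|\012/\n/g;
    return $text;
}

# Hypothetical helper: slurp a file and return its lines without endings.
sub read_normalized_lines {
    my ($path) = @_;
    open my $fh, '<', $path or die "Cannot open $path: $!";
    binmode $fh;    # raw bytes, so no I/O layer translates endings first
    my $raw = do { local $/; <$fh> };
    close $fh;
    # split drops trailing empty fields, so a final newline adds no extra line.
    return split /\n/, normalize_newlines($raw // '');
}
```

This still slurps the file, so the overhead concern stands for large inputs; it only simplifies the conversion step.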
Are there better ways of handling this?
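One possible middle ground between the regex-detection and preprocessing ideas, sketched under the assumption that each file uses a single line-ending style throughout: look only at the first chunk of the file, set $/ to whatever ending appears first, and then read line by line as usual. The names detect_eol and read_lines_any_eol and the 4096-byte chunk size are my own choices for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: inspect the start of a file and return the first
# line ending found (CRLF, CR, or LF). Falls back to "\n" if none is seen.
sub detect_eol {
    my ($fh, $chunk_size) = @_;
    $chunk_size ||= 4096;    # assumed large enough to reach one line ending
    my $chunk = '';
    defined read($fh, $chunk, $chunk_size) or die "Read failed: $!";
    # If the chunk ends in CR, pull one more byte so a CRLF pair is not split.
    if ($chunk =~ /\015\z/) {
        defined read($fh, $chunk, 1, length $chunk) or die "Read failed: $!";
    }
    seek($fh, 0, 0) or die "Cannot rewind: $!";
    return $chunk =~ /(\015\012|\015|\012)/ ? $1 : "\n";
}

# Hypothetical helper: read a file line by line using the detected ending.
sub read_lines_any_eol {
    my ($path) = @_;
    open my $fh, '<', $path or die "Cannot open $path: $!";
    binmode $fh;                    # raw bytes, no CRLF translation
    local $/ = detect_eol($fh);     # restored when this sub returns
    my @lines = <$fh>;
    chomp @lines;                   # chomp strips whatever $/ currently holds
    close $fh;
    return @lines;
}
```

This avoids slurping and rewriting the file, at the cost of trusting the first line ending it sees; a file with mixed endings would still need the conversion approach.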