PerlMonks  

Re: Re: Quick and portable way to determine line-ending string?

by bikeNomad (Priest)
on Aug 09, 2001 at 02:59 UTC (#103273)


in reply to Re: Quick and portable way to determine line-ending string?
in thread Quick and portable way to determine line-ending string?

Sadly, one can't assume that. For instance, I have seen a number of cases where a supposedly-text file from a Unix system has been edited on an MS-DOSish system and hence contains extra "\x0d" characters.

Then, of course, there's the question of what to do on EBCDIC systems, where the line endings are likely to be something entirely different.


Re: Re: Re: Quick and portable way to determine line-ending string?
by nardo (Friar) on Aug 09, 2001 at 03:38 UTC
    If the file contains 0x0d characters, they are presumably meant to be part of the line separator, and the code I wrote will catch them. While many Unix tools treat 0x0d as just another character, if you do in fact have a file with 0x0d 0x0a pairs, you probably want 0x0d 0x0a as your line separator. If the file mixes 0x0d 0x0a with bare 0x0a separators, the code won't work, but as long as the file is consistent it should be fine.
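The code referred to above is in the parent node and is not reproduced here. As a rough sketch of the same idea, and not necessarily how that code works, the following counts each separator style in a chunk of raw bytes and gives up when the styles are mixed. The name `guess_line_ending` is hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: guess the line separator used in a chunk of raw bytes.
# Returns "\x0d\x0a", "\x0a" or "\x0d" when exactly one style is present,
# and undef (in scalar context) when none is found or the styles are mixed.
# Caveat: if the chunk was cut in the middle of a 0x0d 0x0a pair, the
# trailing 0x0d is counted as a bare CR.
sub guess_line_ending {
    my ($buf) = @_;

    # The "= () =" idiom counts the matches of a /g regex.
    my $crlf = () = $buf =~ /\x0d\x0a/g;          # CR LF pairs
    my $lf   = () = $buf =~ /(?<!\x0d)\x0a/g;     # LF not preceded by CR
    my $cr   = () = $buf =~ /\x0d(?!\x0a)/g;      # CR not followed by LF

    my @seen = grep { $_->[1] > 0 }
        ( [ "\x0d\x0a", $crlf ], [ "\x0a", $lf ], [ "\x0d", $cr ] );

    return unless @seen == 1;    # nothing found, or inconsistent endings
    return $seen[0][0];
}
```

For this to see the real bytes, the file would have to be read through a handle on which `binmode` has been called, so that no translation happens first.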
(tye)Re: Quick and portable way to determine line-ending string?
by tye (Cardinal) on Aug 09, 2001 at 05:12 UTC

    On an EBCDIC system, the line endings are probably "\r" and "\n", of course. And there is no point in using "\x0a" and "\x0d" in the previous code. The only use for "\x0a" and "\x0d" is when you might run under MacOS and are using something like a network protocol that requires "\r\n". MacOS made the mistake of changing the definition of "\r" and "\n" rather than translating them. All other systems that use non-Unix line endings _translate_ to/from "\n".

    If it weren't for MacOS, "\r" and "\n" would always be the right choice. The move toward "\x0a" and "\x0d" has been motivated by trying to be portable with MacOS and has caused great confusion. Since very few Perl programmers actually work on the even weirder systems like those that use EBCDIC, the folly of this has not been widely noted (CGI.pm is one of the few places where I've seen this start to be noticed).
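To illustrate the one case conceded above, a protocol that mandates CR LF on the wire, here is a minimal sketch that spells out the octets instead of relying on "\r\n". The names `$CRLF` and `wire_line` are made up for the example, and it assumes the handle it is eventually printed to is in binary mode.

```perl
use strict;
use warnings;

# Explicit octets for a CR LF terminated wire protocol. "\r" and "\n" are
# logical characters whose values differ on some platforms (classic MacOS),
# so they are not used here.
# Assumption: on an EBCDIC platform the payload text would also need
# converting to ASCII before sending, which this sketch ignores.
my $CRLF = "\x0d\x0a";

# Hypothetical helper: terminate one protocol line with CR LF.
sub wire_line {
    my ($text) = @_;
    return $text . $CRLF;
}
```

Ordinary text files, by contrast, would still be read and written with plain "\n", leaving the translation to the platform.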

            - tye (but my friends call me "Tye")
Re: Re: Re: Quick and portable way to determine line-ending string?
by John M. Dlugosz (Monsignor) on Aug 09, 2001 at 07:10 UTC
    And Unicode files that use the new linebreak/parabreak characters! Say it's a UTF-8 encoded file... no 0x0A in sight!

    Like Perl itself, you need to be lenient about reading linebreaks. But you need to know the proper form for writing them.
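One way to sketch "lenient on input, strict on output", assuming a Perl new enough to have the `\R` generic-linebreak escape (5.10 or later, so well after this thread) and text that has already been decoded from UTF-8 into characters. The helper names `split_lines` and `join_lines` are hypothetical.

```perl
use strict;
use warnings;
use 5.010;    # \R (generic linebreak) needs Perl 5.10 or later

# Hypothetical helper: split decoded text into lines, accepting any break
# that \R recognizes: CR LF, LF, CR, U+2028, U+2029 and others. Note that
# \R also treats vertical tab and form feed as line breaks.
sub split_lines {
    my ($text) = @_;
    my @lines = split /\R/, $text, -1;    # -1 keeps trailing empty fields
    # Drop the single empty field produced by a final terminator.
    pop @lines if @lines && $lines[-1] eq '';
    return @lines;
}

# Hypothetical helper: write lines back with one explicitly chosen ending.
sub join_lines {
    my ($ending, @lines) = @_;
    return join '', map { $_ . $ending } @lines;
}
```

Something like `join_lines("\n", split_lines($text))` would then normalize a file with mixed endings to a single form.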
