PerlMonks  

Re: Re: Handling Mac, Unix, Win/DOS newlines at readtime...

by bart (Canon)
on Sep 16, 2002 at 23:44 UTC


in reply to Re: Handling Mac, Unix, Win/DOS newlines at readtime...
in thread Handling Mac, Unix, Win/DOS newlines at readtime...

"I would split on /\r\n?/ instead. That avoids removing blank lines."
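But not on a Mac. On a Mac (MacPerl on classic Mac OS), the meanings of "\n" and "\r" are reversed. "\n" is what you use as the native end-of-line character, remember? And on a Mac, that's chr(13), which leaves "\r" as chr(10). So there, /\r\n?/ matches an LF optionally followed by a CR, which is not what you meant.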
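One way to sidestep the platform-dependent meaning of "\n" and "\r" is to spell the pattern with explicit octal codes. This is only a minimal sketch of the split suggested above, made portable; the sample string and variable names are invented for illustration, and it still handles only CR and CR LF endings.

```perl
use strict;
use warnings;

# Hypothetical sample data: a DOS line, an empty DOS line, then an old-Mac line.
my $text = "one\015\012\015\012two\015three";

# \015 is always CR and \012 is always LF, whatever "\r" and "\n" mean locally.
my @lines = split /\015\012?/, $text;

# The empty line between "one" and "two" survives as an empty string.
print scalar(@lines), " lines\n";
```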

Also, as people tend to forget to upload their HTML as text, you often get sequences of two CR characters and one LF. You want to deal with that, too. So here's my solution:

/\015\015?\012|\015|\012/
which you might want to replace with "\n" using s///g instead of splitting on it, so you get one cleaned-up string to feed into HTML::Parser or similar.
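As a rough sketch of that substitution: the regex is the one given above, while the helper name normalize_newlines and the sample input are made up for this example.

```perl
use strict;
use warnings;

# Hypothetical helper; only the regex comes from the post.
sub normalize_newlines {
    my ($text) = @_;
    # Alternation order matters: CR CR LF and CR LF are tried first,
    # then a lone CR, then a lone LF. Each match becomes one native "\n".
    $text =~ s/\015\015?\012|\015|\012/\n/g;
    return $text;
}

# Made-up input mixing CR LF, CR CR LF, lone CR, and two consecutive LFs.
my $raw   = "a\015\012b\015\015\012c\015d\012\012e";
my $clean = normalize_newlines($raw);

# Splitting on the native newline now keeps the blank line before "e".
my @lines = split /\n/, $clean;
```

Because every line ending is rewritten to a single "\n", the two consecutive LFs still produce an empty line, so blank lines are preserved.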
