Beefy Boxes and Bandwidth Generously Provided by pair Networks
good chemistry is complicated,
and a little bit messy -LW
 
PerlMonks  

Re^2: chomp() is confusing

by ikegami (Pope)
on May 23, 2006 at 16:49 UTC ( #551203=note: print w/ replies, xml ) Need Help??


in reply to Re: chomp() is confusing
in thread Why chomp() is not considering carriage-return

If you run your script on Windows, the carrage return and newline will both be removed (since it's expected that under Windows files will end in both a carriage return and a newline).

That's completely wrong. Perl automatically converts CRLF to LF when reading text, so $/ is set to LF ("\n"). Your script shouldn't see CR at all, so chomp only removes the LF.

Now, if you were dealing with binary data, chomp won't work (without changing $/) because chomp is a text function.

print(unpack('H*', $/), "\n"); # 0a Just LF my $str = <DATA>; print(unpack('H*', $str), "\n"); # 746573740a No CR chomp($str); print(unpack('H*', $str), "\n"); # 74657374 LF chomped. __DATA__ test

If, on the other hand, you want to run a script that will strip the EOL marker off a file regardless of what OS the script is running on and whether that file came from Unix or Windows, you need to do that yourself, with something like:

That regexp is incomplete. It doesn't handle systems that use \r as the EOL marker.


Comment on Re^2: chomp() is confusing
Select or Download Code
Re^3: chomp() is confusing
by aufflick (Deacon) on May 26, 2006 at 05:33 UTC
    Hmm, you might be right about the conversion - All my windows perl runs under cygwin anyway so I don't really get to see that pain and my understanding may be flawed.

    I do have to stand my ground on

    "That regexp is incomplete. It doesn't handle systems that use \r as the EOL marker."
    however - which of the two operating systems I mentioned (UNIX and Windows) use \r as the EOL marker? In fact I specifically said in the next paragraph:

    Of course the other EOL possibility is that of a Classic MacOS text file, where the EOL marker is a lone carriage return (ie. \r). Mercifully you won't come across these files nearly as much as you used to and dealing with them is an exercise left to the reader ;)

Log In?
Username:
Password:

What's my password?
Create A New User
Node Status?
node history
Node Type: note [id://551203]
help
Chatterbox?
and the web crawler heard nothing...

How do I use this? | Other CB clients
Other Users?
Others examining the Monastery: (11)
As of 2014-07-14 11:07 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    When choosing user names for websites, I prefer to use:








    Results (257 votes), past polls