RE: Remove the ^M Character from a Document
by ZZamboni (Curate) on May 08, 2000 at 18:13 UTC
|
On a similar note, the other day I had a PostScript file with
an embedded pixmap graphic which contained line breaks represented
only as "\r", in addition to the usual DOS "\r\n" at the
end of each line. The graphic with the \r's came up, in Unix,
as a single over-900K line, which broke most programs I tried
to use to manipulate the file (those programs were not written in
Perl, clearly :-)
So based on this snippet, I came up with the following one-liner:
perl -pi -e 's/\r\n?/\n/g' file
which solved my problem.
|
You might want to try a different delimiter, to lessen the obfuscation of this syntax, such as:
perl -pi.orig -e 's#\r\n#\n#g' filespec
or
perl -pi.orig -e 's,\r\n,\n,g' filespec
Though ideally, this is more correct:
perl -pi.orig -e 's,\cM,,g' filespec # commas for clarity
|
The use of the forward slash to delimit regular expressions
and replacements has a history of decades - predating the
birth of Perl by years. There are no forward slashes in the
regular expression that could cause confusion. So, other than
a fear of forward slashes, what makes you think use of a
forward slash contributes to obfuscation?
Abigail
Re: Remove the ^M Character from a Document
by thaigrrl (Monk) on Jan 04, 2001 at 23:15 UTC
|
Or... on Solaris you can do:
dos2unix <filename> <newfilename>
|
You can get a dos2unix (and unix2dos) for any flavor of UNIX/BSD.
Cheers,
KM
RE: Remove the ^M Character from a Document
by le (Friar) on Jun 06, 2000 at 16:42 UTC
|
I have one alternative left:
The ^M character can be entered literally by hitting Ctrl-v Ctrl-m,
so typing:
perl -pi -e 's/Ctrl-v Ctrl-m//g' filename
will replace the annoying ^M's too.
Remember: Ctrl-v Ctrl-m is a key combination, not literal text.
Re: Remove the ^M Character from a Document
by KM (Priest) on Jan 04, 2001 at 23:20 UTC
|
For those who use this for S&R beyond the scope of Ctrl chars (like words, sentences, etc...), using perl -pi.bak -e 's!something!something else!' file
is helpful so you can have a backup of your file in case something occurs which you didn't expect. Refer to perlrun.
Cheers,
KM
RE: Remove the ^M Character from a Document
by Anonymous Monk on Mar 23, 2000 at 00:12 UTC
|
If you're in a Unix environment (where \n is the EOL char) you can just as easily do:
perl -pi -e 's/\r//g' <file name>
This works because in DOS, EOL is represented as \r\n, so deleting every \r leaves only the Unix \n.
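For comparison, here is a non-Perl sketch of the same idea using tr, which simply deletes every carriage return. The file names dos.txt and unix.txt are made up for the demo. tr cannot edit in place, so the result goes to a second file first.

```shell
# Made-up sample with DOS line endings (CR LF).
tmpdir=$(mktemp -d)
printf 'one\r\ntwo\r\n' > "$tmpdir/dos.txt"

# tr reads stdin and writes stdout, so write a new file and move it back.
tr -d '\r' < "$tmpdir/dos.txt" > "$tmpdir/unix.txt"
mv "$tmpdir/unix.txt" "$tmpdir/dos.txt"

# Count the CR bytes that remain; 0 means the file is clean.
cr_count=$(tr -cd '\r' < "$tmpdir/dos.txt" | wc -c | tr -d ' ')
echo "CRs left: $cr_count"
```

Like s/\r//g, this removes every CR, so it is only appropriate when no CR in the file is meant to stand for a line break on its own.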
|
Not sure why the ones mentioned above do not work,
but yours does :D
Thanks.
Wiseness does not come with age, but
with the mind to realise...
|
What if you are in a DOS environment and you want to remove what will become the offending ^M when the file is opened in Unix? Running the search and replace doesn't do anything. You still end up with the carriage return in front of the linefeed.
Simple Way...
by Anonymous Monk on Jan 15, 2002 at 00:44 UTC
|
vi the file and, while in command mode, type in:
:%s/[ctrl+v][ctrl+m]//g
Hitting ctrl+v tells vi to insert the next keystroke literally, and ctrl+m then enters the carriage return, which is displayed as ^M.
basically searches and replaces all ^M with nothing
RE: Remove the ^M Character from a Document
by muppetBoy (Pilgrim) on May 11, 2000 at 16:12 UTC
|
If you are trying to do this substitution from the command line you could just use dos2unix/unix2dos. (on Unix box - I think the commands are available on all flavours(?))
|
If you need to strip it from all the *.html files in a directory, try this sh snippet (^V^M means typing Ctrl-V then Ctrl-M, not the literal characters):
for A in *.html; do if [ -f "$A" ]; then sed -e 's/^V^M//g' "$A" > /tmp/foo.$$ && mv /tmp/foo.$$ "$A"; fi; done
it's not perl, so sue me :-P
edit by mirod: added code tags
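A variant of that loop that avoids typing control characters, as a sketch only: the scratch directory and the sample files a.html and b.html are made up, and the carriage return is produced with printf instead of Ctrl-V Ctrl-M.

```shell
# Work in a scratch directory with two made-up sample files.
tmpdir=$(mktemp -d)
cd "$tmpdir" || exit 1
printf '<p>hi</p>\r\n' > a.html
printf '<p>bye</p>\r\n<br>\r\n' > b.html

# Hold a literal carriage return in a variable so nothing has to be typed
# as Ctrl-V Ctrl-M.
cr=$(printf '\r')

for f in *.html; do
    [ -f "$f" ] || continue
    # Strip CRs into a temp file; replace the original only if sed succeeded.
    sed "s/$cr//g" "$f" > "$f.tmp" && mv "$f.tmp" "$f"
done

# Count the CR bytes left across both files.
left=$(cat a.html b.html | tr -cd '\r' | wc -c | tr -d ' ')
echo "CRs left: $left"
```

Quoting "$f" keeps file names with spaces intact, and the && means a failed sed never overwrites the original.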
RE: Remove the ^M Character from a Document
by Anonymous Monk on Mar 17, 2000 at 20:08 UTC
|
hmm...
it didn't work for me !!
|
This works for me.
$string =~ s/[\r\n]//g;  # \r and \x0d are the same character; note this strips newlines too
|
If you want to do this from Windows (rather than Unix), it seems to be very hard to stop perl from converting anything that looks like ascii 10 back to ascii 13 + ascii 10.
The one way I've managed to do it is to put the file handle into binmode, i.e.:
binmode STDOUT;
while (<>) {
s/\n//;
print "$_" . chr(10);
}
Possibly there is a variable that controls this, but I haven't found it, and things like $OFS in the program and -l012 on the command line don't seem to help (in perl 5.16). Possibly someone might want to look into this in more detail.
Re: Remove the ^M Character from a Document
by Anonymous Monk on Feb 22, 2003 at 14:28 UTC
|
February 22, 2003:
It worked for me... however, I was executing from a web-based .PL script call and not a native command line.
-Postmaster,
www.churchsermon.org
Re: Remove the ^M Character from a Document
by umasuresh (Hermit) on Oct 04, 2010 at 16:00 UTC
|
Along the same lines:
I often get these strange characters ^[[00m when I save the output of the list command into a file (ls *.txt > list); they are visible only in vi. I am aware that these characters appear because of the alias ls='ls --color' in my .bashrc file. I don't want to unalias ls in each window. Is there a similar solution for fixing this?
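One possible approach, offered as a sketch and not a definitive answer: strip the color sequences from the saved file with sed. The pattern below only covers sequences of the form ESC [ digits-and-semicolons m, which is what ^[[00m is. The file names list and list.clean and the sample line are made up.

```shell
tmpdir=$(mktemp -d)

# Hold a literal ESC byte (shown as ^[ in vi) in a variable.
esc=$(printf '\033')

# Simulate a listing saved with colors forced on: ESC[01;32m ... ESC[00m
printf '%s[01;32mnotes.txt%s[00m\n' "$esc" "$esc" > "$tmpdir/list"

# Remove every sequence of the form ESC [ digits-and-semicolons m.
sed "s/$esc\[[0-9;]*m//g" "$tmpdir/list" > "$tmpdir/list.clean"
cat "$tmpdir/list.clean"
```

To avoid the sequences in the first place, bash skips alias expansion for \ls or command ls, so \ls *.txt > list should write a plain listing. Changing the alias to ls --color=auto is another option, since auto only colors output that goes to a terminal.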