PerlMonks  

regexp for scalar containing a mixture of LF & CRLF

by NateTut (Deacon)
on Dec 09, 2004 at 22:52 UTC ( [id://413710]=perlquestion )

NateTut has asked for the wisdom of the Perl Monks concerning the following question:

Monks, I have a scalar containing text with a mixture of line-ending characters (it's Lotus Notes e-mail message text). I want to replace all of the LFs with CRLFs.
I have tried many different regexp approaches. Below is the relevant snippet:
open (TEXTFILE, ">$subdir/message.txt") or die "Can't create $subdir message file: $!";
print TEXTFILE "From: ", $doc->{From}->[0];
print TEXTFILE "Subject: ", $doc->{Subject}->[0];
$doc->{Body} =~ s/\015/\n/gm;
print TEXTFILE $doc->{Body};
close TEXTFILE;
$doc->{Body} contains the text with the mixture of LFs & CRLFs. This is a WinDoze app so I'm sure print is putting CRLFs on the line ends.
In the s/// I've tried everything I can think of: \x0D instead of \015, /s instead of /m at the end, etc. I still end up with extraneous LFs. I think the problem is my understanding of what a line means in a regexp, or something like that.

Doug

***Update*** here's how I finally got it to work.
open (TEXTFILE, ">$subdir/message.txt") or die "Can't create $subdir message file: $!";
print TEXTFILE "From: ", $doc->{From}->[0];
print TEXTFILE "Subject: ", $doc->{Subject}->[0];
my $NewBody = $doc->{Body};
$NewBody =~ s/\x0D\n/\n/g;
print TEXTFILE $NewBody;
close TEXTFILE;
The problem was that I was trying to modify a Lotus Notes Object. Kudos to diotalevi for that tip.
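For readers who want to try the copy-then-substitute pattern without Lotus Notes installed, here is a minimal sketch. The hash `$doc` is a hypothetical stand-in for the Notes document, so it cannot reproduce whatever the real OLE object does when modified in place; it only shows the shape of the working code, with `\012` written out instead of `\n` to keep the byte values explicit.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the Notes document; a real Win32::OLE object
# may not behave like this plain hash when modified in place.
my $doc = { Body => "line one\015\012line two\012" };

# Work on a copy so the substitution never touches the original field.
my $new_body = $doc->{Body};
$new_body =~ s/\015\012/\012/g;    # CRLF -> LF; lone LFs are left alone
```

After the substitution every line in `$new_body` ends in a bare LF. Printing that to a text-mode filehandle on Windows then writes each LF as CRLF, which is presumably why the file comes out consistent.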

Doug

Replies are listed 'Best First'.
Re: regexp for scalar containing a mixture of LF & CRLF
by diotalevi (Canon) on Dec 09, 2004 at 23:12 UTC
      You hit the nail right on the head. Once I made a copy of the object the regexp started working.

      Thanks!

      Doug
        I goofed by calling it a Notes::OLE object. You'd have a Win32::OLE object. I wrote Notes::OLE and the names just transposed themselves in my fingers while I typed to you.
Re: regexp for scalar containing a mixture of LF & CRLF
by Eimi Metamorphoumai (Deacon) on Dec 09, 2004 at 23:05 UTC
    I think what you want is
    s/(?<!\r)\n/\r\n/;
    The problem with your code is that you're trying to replace a single CR with an LF, so if you have an LF alone, it's unchanged. If you start with CRLF, you end up with LFLF, which isn't what you want. What you need to look for is an LF not preceded by CR, and replace it with CRLF.

    Update: It appears I spoke a little too soon, and with too much Unix-centrism. The text was fine, but I forgot that "\n" in Perl changes meaning under different platforms. What you want is

    s/(?<!\015)\012/\015\012/g;
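A small, self-contained sketch of the lookbehind substitution above, in case it helps to see it run on a known string. The helper name `lf_to_crlf` and the sample text are made up for illustration and are not from the original post.

```perl
use strict;
use warnings;

# Hypothetical helper: turn every bare LF into CRLF, leaving existing
# CRLF pairs untouched.
sub lf_to_crlf {
    my ($text) = @_;
    # Match an LF (\012) that is not already preceded by a CR (\015).
    $text =~ s/(?<!\015)\012/\015\012/g;
    return $text;
}

my $mixed = "one\015\012two\012three\012";
my $fixed = lf_to_crlf($mixed);

# Running it a second time changes nothing, since no bare LF remains.
my $again = lf_to_crlf($fixed);
```

One caveat when writing the result out: a text-mode filehandle on Windows expands each LF to CRLF again on output, so a string that already contains CRLF would need `binmode` on the handle to reach the file unchanged.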
      Thanks, but nope. See my next post for my ugly answer that finally worked.

      Doug
Re: regexp for scalar containing a mixture of LF & CRLF
by Paladin (Vicar) on Dec 09, 2004 at 23:03 UTC
    What about something like:
    s/\015?\012/\n/g;
    This will replace every LF and every CRLF with whatever your local line ending is.

    Note: The /m modifier only affects the ^ and $ anchors, so is doing nothing in your RE since you have neither. The /s only affects what the . meta-character matches, and again, since you aren't using it, does nothing.
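To illustrate the note about modifiers, this sketch runs the same substitution with no extra modifier, with /m, and with /s on a made-up sample string. Since the pattern contains no `^`, `$`, or `.`, all three should give the same result. The variable names are illustrative only.

```perl
use strict;
use warnings;

my $mixed = "one\015\012two\012three\012";

# Same pattern three times; only the modifiers differ.
(my $plain  = $mixed) =~ s/\015?\012/\n/g;
(my $with_m = $mixed) =~ s/\015?\012/\n/gm;   # /m only changes ^ and $
(my $with_s = $mixed) =~ s/\015?\012/\n/gs;   # /s only changes what . matches

# Count any carriage returns that survived the substitution.
my $cr_left = ($plain =~ tr/\015//);
```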

      Thanks but it didn't work. See my next post for the ugly way I finally got it to work.

      Doug
