PerlMonks  

regular expression: match multiple newlines

by wegelin (Initiate)
on Nov 03, 2013 at 19:25 UTC (#1061063=perlquestion)
wegelin has asked for the wisdom of the Perl Monks concerning the following question:

I work at the Unix command line (bash shell) on a Mac running 10.6.8. In a text file, I want to replace every run of multiple newlines, even if the newlines have other whitespace between them, with EEEEE. Single newlines I want to leave alone. Thus the following file, called textfile:
dogs rats
cats

fish
must be transformed into
dogs rats
catsEEEEEfish
But as you see from the example below, the following regular expression, issued at the command line, doesn't do it.
s/\n\s*\n/EEEEE/g
Here is the example:
> cat textfile
dogs rats
cats

fish
> perl -p -e 's/\n\s*\n/EEEEE/g' textfile
dogs rats
cats

fish

Is there a simple or elegant solution?

Re: regular expression: match multiple newlines
by Kenosis (Priest) on Nov 03, 2013 at 19:38 UTC

    Try:

    s/\n{2,}/EEEEE/g

    Output on your dataset:

    dogs rats
    catsEEEEEfish

    The \n{2,} notation matches 2+ newlines.

    Edit: My apologies. I didn't notice the OP's mention of possible other whitespace between the newlines.

Re: regular expression: match multiple newlines
by Cristoforo (Deacon) on Nov 03, 2013 at 19:42 UTC
    Your code is reading 1 line at a time. You need to 'slurp' the file and apply the regular expression:

    perl -0 -p -e 's/\n\s*\n/EEEEE/g' textfile

      Good point! However, \n\s*\n would replace \n\x20\n --a space surrounded by two newlines--which is not two consecutive newlines, thus the prior \n{2,} suggestion.

      Edit: My apologies. I didn't notice the OP's mention of possible other whitespace between the newlines.

        He stated that there might be spaces between the newlines :-)
Re: regular expression: match multiple newlines
by Lennotoecom (Pilgrim) on Nov 04, 2013 at 01:22 UTC
    /(?<=\S)\s*$/ ? ({print "$a$`"},$a="\n") : ($a='EEEEE') while <DATA>;
    __DATA__
    dogs rats
    cats

    fish
    text
    cats2

    fish2
    text3
    output:
    dogs rats
    catsEEEEEfish
    text
    cats2EEEEEfish2
    text3
