PerlMonks  

remove blank lines with regex

by amarceluk (Beadle)
on May 21, 2002 at 12:43 UTC ( [id://168099]=perlquestion )

amarceluk has asked for the wisdom of the Perl Monks concerning the following question:

This is a very basic question. I've read all the nodes on removing blank lines, but nothing is working. In fact, the code I'm posting here worked a few days ago. I don't know what I'm doing wrong.
#!/usr/bin/perl
print "What file do you want to run this program on?\n";
$TheFile=<STDIN>;
chomp ($TheFile);
open(FILE, $TheFile) or die "Can't open $TheFile.\n";
local $/ = undef;
$lines = <FILE>;
close(FILE);
while ($lines =~ /\n\n/) {
    $lines =~ s/\n\n/\n/gms;
}
open(FILE, ">output.txt") or die "Can't open output.txt.\n";
print FILE $lines;
close FILE;
Can someone tell me why that doesn't work, or what should work? Thanks!

Replies are listed 'Best First'.
Re: remove blank lines with regex
by Biker (Priest) on May 21, 2002 at 13:18 UTC

    Consider this:

    # Open both IN_FILE and OUT_FILE here!
    while(<IN_FILE>) {
        chomp;
        next unless length;
        print OUT_FILE "$_\n";
    }

    I always come back to the following:

    Do not slurp in a file into an array. Someday that input file will be huge. And that will happen when used for
    production data. And you're on vacation. And you'll have to come in to the office during that sunny day on the beach.

    Read the input file line by line and act upon each line (here by potentially writing it to the output file.)

    Everything went worng, just as foreseen.
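
The line-by-line approach above leaves the opens as an exercise. Here is a minimal, runnable sketch of the same idea. It assumes lexical filehandles and in-memory "files" (the `$input` and `$output` scalars are stand-ins invented for this example), so it runs without touching disk; in a real script you would open actual paths instead.

```perl
use strict;
use warnings;

# Sample input held in memory; replace \$input with a real file path in practice.
my $input  = "first\n\nsecond\n\n\nthird\n";
my $output = '';

open my $in,  '<', \$input  or die "Can't open input: $!";
open my $out, '>', \$output or die "Can't open output: $!";

# Read one line at a time so memory use does not grow with file size.
while (my $line = <$in>) {
    chomp $line;
    next unless length $line;    # skip lines that are empty after chomp
    print {$out} "$line\n";
}

close $in;
close $out or die "Can't close output: $!";

print $output;
```

Note that this only drops lines that are completely empty. Lines containing spaces or tabs would pass through; testing `$line =~ /\S/` instead of `length` is one way to drop those too.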

      Do not slurp in a file into an array. Someday that input file will be huge.

      I second that, and here is my 2-cent solution:

      perl -wnl -e 'print $_ unless /^$/' infile >outfile

      see also:

      perldoc perlrun
      perldoc perlre

      -- Joost
      downtime n. The period during which a system is error-free and immune from user input.

      Thanks, good advice. I got the "slurping files into an array" snippet from something in the Q&A on regex, but I will avoid it henceforth!
Re: remove blank lines with regex
by broquaint (Abbot) on May 21, 2002 at 12:52 UTC
    Try adding the s modifier to the regex match in the while loop. You might also want to use $/ instead of a literal \n, as it's more portable. That said, this code should do the job without having to loop through the string:
    $lines =~ s{$/$/}{$/}gs;

    HTH

    _________
    broquaint
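
One caveat that seems worth checking before interpolating $/ into a regex: the script in the question sets $/ to undef to slurp the file and never restores it, so by the time the substitution runs, $/ would interpolate as an empty string rather than a newline. A minimal sketch of one way around that, assuming the separator is saved first and the undef is confined to a `do` block (the `$data`, `$sep` and `$text` names are made up for this example):

```perl
use strict;
use warnings;

my $data = "a\n\n\nb\n\nc\n";       # sample input with blank lines
my $sep  = $/;                       # save the separator ("\n" by default)

open my $fh, '<', \$data or die "Can't open in-memory file: $!";
my $text = do { local $/; <$fh> };   # slurp; $/ is restored when the block ends
close $fh;

# Collapse runs of two or more separators into a single one.
# \Q...\E quotes the separator in case it contains regex metacharacters.
$text =~ s/(?:\Q$sep\E){2,}/$sep/g;

print $text;
```

The same slurping caveat from elsewhere in this thread still applies: this holds the whole file in memory.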

Re: remove blank lines with regex
by jeffenstein (Hermit) on May 21, 2002 at 12:59 UTC

    The first regex doesn't have the s flag. This will prevent the second from ever being run. For that matter, you can replace the entire while(){} with this regex:

    $lines =~ s/\n+/\n/gs;
      FYI
      The /s flag only matters if you have '.' in your regex.
      So /\n\n/ will work just as well as /\n\n/s.

      From perlre:
      ... s Treat string as single line. That is, change "." to match any character whatsoever, even a newline, which normally it would not match. ...

      --perlplexer
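
A quick way to check the point about /s for yourself. This is just an illustrative sketch with made-up variable names: it compares a pattern of literal newlines with and without /s, then does the same for a pattern that contains '.'.

```perl
use strict;
use warnings;

my $text = "a\n\nb";

# Literal \n in the pattern: /s makes no difference.
my $plain  = ($text =~ /\n\n/)  ? 1 : 0;
my $with_s = ($text =~ /\n\n/s) ? 1 : 0;

# A '.' in the pattern: /s decides whether it may match a newline.
my $dot_plain  = ("a\nb" =~ /a.b/)  ? 1 : 0;
my $dot_with_s = ("a\nb" =~ /a.b/s) ? 1 : 0;

print "newline pattern: plain=$plain, with /s=$with_s\n";
print "dot pattern:     plain=$dot_plain, with /s=$dot_with_s\n";
```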
Re: remove blank lines with regex
by arunhorne (Pilgrim) on May 21, 2002 at 14:46 UTC

    You could try:

    $line =~ s!$/+!$/!sg;

    Note the replacement of / with ! as the regex delimiter. This is perfectly allowed and allows the easy use of $/.

    In my opinion this is better than s/\n+/\n/ as it allows for the possibility that your line terminator $/ has been set to something else (admittedly, this must be done explicitly by you). Taking this approach is the safer option with respect to bugs: if you bury s/\n+/\n/ in a library, you may start getting random errors from users because they have changed $/. I'm not saying there isn't a counter-argument that works the other way.

    Arun

      Better, yes, but broken? If $/ consists of multiple characters then it will not function as intended. Consider this as an example:

      my $foo = "Camel meme power supreme!";
      my $bar = "me";
      $foo =~ s/$bar+/$bar/g;

      The answer is not what you'd expect: the string remains unchanged. It seems that brackets are required so that the entire variable is repeated and not just the last character of the variable. "meme" to "me" and not "mee" to "me":

      $foo =~ s/($bar)+/$bar/g;

      So this translates back into something like this:

      $line =~ s!($/)+!$/!g;

      /s modifies ., which is not used, so ditch it. You don't need to use /s just to work on multi-line strings, but sometimes you want to.
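
To see the grouping issue side by side, here is a small sketch using the same sample string. The non-capturing group `(?:...)` and the `\Q...\E` quoting are additions of this example, not part of the posts above; they avoid an unused capture and guard against metacharacters in the variable.

```perl
use strict;
use warnings;

my $bar = "me";

# Without grouping, + applies only to the final "e", so the pattern is /me+/.
(my $ungrouped = "Camel meme power supreme!") =~ s/$bar+/$bar/g;

# With grouping, + applies to the whole interpolated string.
(my $grouped = "Camel meme power supreme!") =~ s/(?:\Q$bar\E)+/$bar/g;

print "ungrouped: $ungrouped\n";
print "grouped:   $grouped\n";
```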

Node Type: perlquestion [id://168099]
Approved by broquaint