http://www.perlmonks.org?node_id=870005

Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

Hello Monks, I need a hand modifying this bit of code:  s/"\s+", " "//g; such that it leaves 2 newline characters found in succession. Thank you for your help, Jules

Replies are listed 'Best First'.
Re: Skip 2 new lines
by GrandFather (Saint) on Nov 08, 2010 at 04:31 UTC

    Maybe you should show us a little sample because what you have shown is not consistent with the problem you describe (or at least not my understanding of it). As given the regex doesn't alter new line characters except in the unlikely context where \s+ is in your expression. That seems an unusual place to be worried about preserving paragraph breaks.

    However, if indeed you want to preserve paragraphs where the \s+ is, the following may turn the trick for you:

    s/"(?:(?<!\s)(?!\n\n)\s)+", " "//g;
    True laziness is hard work
Re: Skip 2 new lines
by CountZero (Bishop) on Nov 08, 2010 at 07:37 UTC
    A 100% regex-free solution:
    use strict;
    use warnings;
    use 5.012;

    while (<DATA>) {
        chomp;
        print "$_ ";
        say unless $_;
    }

    __DATA__
    this is a line
    which is part of
    a sentence. And this
    is the second sentence
    in the target paragraph.

    This should be the
    start of a new
    paragraph ....
    Output:
    this is a line which is part of a sentence. And this is the second sentence in the target paragraph.
    This should be the start of a new paragraph ....

    CountZero

    "A program should be light and agile, its subroutines connected like a string of pearls. The spirit and intent of the program should be retained throughout. There should be neither too little or too much, neither needless loops nor useless variables, neither lack of structure nor overwhelming rigidity." - The Tao of Programming, 4.1 - Geoffrey James

Re: Skip 2 new lines
by mjscott2702 (Pilgrim) on Nov 08, 2010 at 08:43 UTC
    If what you are trying to do is replace 2 consecutive whitespace characters apart from 2 consecutive newlines (not sure from that substitute command, especially with the double-quotes), you could try something like:

    s/\s[\ \t\r\f]/ /g

    i.e. any whitespace character (including newline) followed by any whitespace character (except newline)

      Thank you to everyone - very helpful.
Re: Skip 2 new lines
by murugu (Curate) on Nov 08, 2010 at 02:20 UTC
    Hi,

    Assuming that you are going to read a file,

    $/="\n\n";

    Regards,
    Murugesan Kandasamy
    use perl for(;;);

      Hi Murugu,

      Thanks for the feedback. This didn't quite do what I was hoping... :)

      In my file I have '2' paragraphs that I would like to create, but each section of text that I want to make a paragraph out of has a newline every 4th word - the only thing that separates each 'section'/'to be paragraph' is 2 newlines that I would like to have maintained in the newly created file.

      Right now my code just makes one paragraph out of the 2 sections.

      Thanks again.

Re: Skip 2 new lines
by ww (Archbishop) on Nov 08, 2010 at 04:23 UTC
    Instead of thinking in terms of modifying (cargo-culted?) code, you'll probably do better thinking through exactly what has to happen, given the data you failed to include, initially (but have, in effect, provided in your first reply).

    For example, assuming your description means data like this:

    this is a line
    which is part of
    a sentence. And this
    is the second sentence
    in the target paragraph.

    This should be the
    start of a new
    paragraph ....

    Then it's pretty obvious that you should remove any instance of \n that's not followed immediately by another... but leave any instance of \n\n alone.

    And having performed that analysis, you can look for methods of implementing such an algorithm.

    Hint: you'll find one answer in the discussion of regexen which "look ahead" in the documentation for regular expressions -- perldoc perlretut.
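
    The lookaround idea hinted at above might be sketched as follows. This is only an illustration, not code posted in the thread: the helper name join_wrapped_lines and the sample text are assumptions. It also replaces each lone newline with a space rather than deleting it outright, so that words on adjacent lines do not run together, and it adds a lookbehind so that the second newline of a pair is left alone as well.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): join the wrapped lines inside
# each paragraph, leaving "\n\n" paragraph breaks untouched.
sub join_wrapped_lines {
    my ($text) = @_;
    # Match a newline with no newline directly before it and neither a
    # newline nor the end of the string directly after it.
    $text =~ s/(?<!\n)\n(?!\n|\z)/ /g;
    return $text;
}

# Assumed sample data: short wrapped lines, one blank line between paragraphs.
my $sample = "this is a line\nwhich is part of\na sentence.\n\n"
           . "This should be the\nstart of a new\nparagraph ....\n";

print join_wrapped_lines($sample);
```

    With this sample, each paragraph comes out on a single line and the blank line between the two paragraphs is preserved.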

    Another approach, since you say you're working from a file, would set $/ to match the double newline. You can read about that in any number of posts (easily found using Super Search or in perlvar).
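
    A minimal sketch of the $/ approach, assuming the input looks like the sample data above. The helper name read_paragraphs is made up for illustration, and the text is read from an in-memory filehandle only to keep the example self-contained; a real script would open the actual file instead.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): read the text one paragraph
# at a time and return each paragraph with its wrapped lines joined.
sub read_paragraphs {
    my ($text) = @_;
    # In-memory filehandle; stands in for the real input file.
    open my $fh, '<', \$text or die "Cannot open in-memory file: $!";
    local $/ = "\n\n";           # a "line" now ends at a double newline
    my @paragraphs;
    while (my $para = <$fh>) {
        chomp $para;             # strips the trailing "\n\n" (the current $/)
        $para =~ s/\n\z//;       # the last paragraph ends in a single newline
        $para =~ tr/\n/ /;       # join the wrapped lines
        push @paragraphs, $para;
    }
    close $fh;
    return @paragraphs;
}

# Assumed sample data.
my $input = "this is a line\nwhich is part of\na sentence.\n\n"
          . "This should be the\nstart of a new\nparagraph ....\n";

print join("\n\n", read_paragraphs($input)), "\n";
```

    Setting $/ on its own changes nothing until the file is actually read in a loop like this one, which may be why the bare pointer "didn't quite do" what was hoped. If the paragraphs can be separated by more than one blank line, setting $/ to the empty string (paragraph mode, described in perlvar) is probably the better choice.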

    Update: Oops. brain-lock alert! I should have noted (above) that murugu's response is a valid suggestion... and also that AnonyMonk should not presume that pointers like that are completed code. Since AM posted no code to illustrate the statement "This didn't quite do what I was hoping..." it's hard to tell where the problem lies.

    So, Anonymous Monk, please read On asking for help, How do I post a question effectively? and I know what I mean. Why don't you? for some more suggestions; suggestions about how to help us to help you.

Re: Skip 2 new lines
by aquarium (Curate) on Nov 08, 2010 at 04:42 UTC
    And yet another approach: first replace all \n\n with a special character or character sequence that won't appear in normal text. Then do whatever you want with the single \n's, plus any other processing. Finally, expand the special character or sequence back to \n\n. This is a bit long-winded, but it avoids the traps of complex regular expressions that can trip you up when you're not looking.
    the hardest line to type correctly is: stty erase ^H
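
    The placeholder steps described above could look something like this. It is a sketch under stated assumptions: the helper name join_with_placeholder and the marker string are invented for the example, and the marker is assumed never to occur in the real text (the code dies if it does).

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread) implementing the
# hide / process / restore steps.
sub join_with_placeholder {
    my ($text) = @_;
    my $marker = "\x00PARA\x00";   # assumption: never appears in the input
    die "marker already present in input" if index($text, $marker) >= 0;

    my $had_final_newline = $text =~ s/\n\z//;   # set the last newline aside
    $text =~ s/\n\n/$marker/g;       # 1. hide the paragraph breaks
    $text =~ tr/\n/ /;               # 2. join the remaining single newlines
    $text =~ s/\Q$marker\E/\n\n/g;   # 3. restore the paragraph breaks
    $text .= "\n" if $had_final_newline;
    return $text;
}

# Assumed sample data.
my $sample = "this is a line\nwhich is part of\na sentence.\n\n"
           . "This should be the\nstart of a new\nparagraph ....\n";

print join_with_placeholder($sample);
```

    No lookarounds are needed here, which is the point of the approach; the cost is having to pick a marker that is guaranteed not to collide with the data.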