Removing multiple newlines from a line using regex
by fbicknel (Sexton)
on Jul 18, 2011 at 22:38 UTC
I sought, but did not find; so I came up with my own solution. Say you have a $tring that holds a multi-line (yet short) config file. You may have obtained such a string with:
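One common way to get a whole file into a single scalar is to undefine the input record separator before reading. A minimal sketch, assuming a hypothetical helper named slurp_file and a path you supply yourself:

```perl
use strict;
use warnings;

# Hypothetical helper (the name is illustrative): read a whole file
# into one scalar, newlines included.
sub slurp_file {
    my ($path) = @_;
    open my $fh, '<', $path or die "Cannot open $path: $!";
    local $/;               # undef record separator: one read returns everything
    my $content = <$fh>;
    close $fh;
    return $content;
}
```

With that in place, something like my $tring = slurp_file($path); leaves the entire config, embedded newlines and all, in $tring.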
Anyway, you want to join lines that sit between some pair of delimiting characters, such as (parens), so that a parenthesized group spread over several lines ends up on a single line.
Why? You may want to gather all the baz info in a separate r.e. session following this one. I'm not going to talk about that step here, just the step that gathers the multiple lines into a single line.
Here's my solution. I offer it in case it's a good one. (Updated: replaced the deprecated \1 notation in the replacement with $1; but see the additional suggestion without a loop from jwkrahn, below.)
The while loop capitalizes on the fact that the s/// inside returns the number of substitutions it made, so it keeps returning a true (non-zero) result until it can find no more lines to join. The r.e. looks for an open paren, \(, followed by anything but a close paren, [^)], zero or more times, *. If found, that part becomes group 1 ($1 later). That must be followed by a newline, \n, then any characters preceding a closing paren, \), which together become group 2 ($2 later). If we find all that, substitute it with group 1 and group 2 without the intervening newline ($1$2).
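The description above can be sketched as follows. This is a reconstruction from the prose, not necessarily the exact code as posted: the helper name join_paren_lines and the sample data are made up for illustration, and it assumes the parens are balanced and not nested.

```perl
use strict;
use warnings;

# Hypothetical helper: join lines that fall inside a (...) pair.
sub join_paren_lines {
    my ($text) = @_;
    # Group 1: "(" plus any run of non-")" characters.
    # Then exactly one newline, which is dropped.
    # Group 2: any non-")" characters up to and including ")".
    # Each pass removes at most one newline per parenthesized group,
    # so repeat until s/// reports no substitutions.
    # (/s is kept to mirror the description; it only affects ".",
    # which this particular pattern does not use.)
    1 while $text =~ s/(\([^)]*)\n([^)]*\))/$1$2/sg;
    return $text;
}

my $tring = "foo = 1\nbaz = (a,\nb,\nc)\nbar = 2\n";
print join_paren_lines($tring);
# prints:
# foo = 1
# baz = (a,b,c)
# bar = 2
```

Lines outside parens are left alone, because the pattern can only start matching at an open paren.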
The modifiers /sg tell Perl to let . match a newline, so a match can run across lines (/s), and to do this as many times as it matches in the string (/g). Note that it will only happen once per find, per time through the loop. Thus, if you have several formations with the form above (see baz example), it will remove one newline from each of those several formations in the string. But it only removes one newline from each formation with each iteration through the while loop, so a formation spread over N lines needs N-1 passes.
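To see the one-newline-per-formation-per-pass behavior, here is a small sketch that counts the passes. The helper name join_counting_passes and the sample data are assumptions for illustration, and the pattern is the same reconstruction from the description rather than a verbatim copy.

```perl
use strict;
use warnings;

# Hypothetical helper: returns the joined text plus the number of
# passes in which s///g actually changed something.
sub join_counting_passes {
    my ($text) = @_;
    my $passes = 0;
    $passes++ while $text =~ s/(\([^)]*)\n([^)]*\))/$1$2/sg;
    return ($text, $passes);
}

# "foo" spans 2 lines (1 newline); "baz" spans 4 lines (3 newlines).
my $tring = "foo (1,\n2)\nbaz (a,\nb,\nc,\nd)\n";
my ($joined, $passes) = join_counting_passes($tring);
print "passes: $passes\n";   # prints "passes: 3"
print $joined;               # foo (1,2) and baz (a,b,c,d), one per line
```

The first pass fixes foo completely and removes one newline from baz; the next two passes finish baz, so the pass count follows the formation with the most embedded newlines.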
I hope this will be helpful to someone in the future. If you know a better way, feel free to chime in.