Re: Iterator to parse multiline string with \\n terminator

in reply to Iterator to parse multiline string with \\n terminator

In the absence of seeing a context requiring anything more complex, I'd probably code something along these lines:

#!/usr/bin/env perl

use strict;
use warnings;

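# Match a line whose last character before the newline is a single
# backslash; the negative look-behind rejects an escaped one (\\).
my $re = qr{^(.*)(?<![\\])[\\]\n$};

my $line = '';

while (<DATA>) {
    if (/$re/) {
        # Continuation: keep the text before the backslash and read on.
        $line .= $1;
        next;
    }

    # Final part of the record: emit it and start afresh.
    $line .= $_;
    print $line;
    $line = '';
}

# Flush anything left over if the input ended on a continuation line.
print $line if length $line;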

__DATA__
Line 1 Part A \
Line 1 Part B \
Line 1 Part C
Line 2 ALL
Line 3 Part X \
Line 3 Part Y
Line 4 END WITH BACKSLASH \\
Line 5 LAST Z

Output:

Line 1 Part A Line 1 Part B Line 1 Part C
Line 2 ALL
Line 3 Part X Line 3 Part Y
Line 4 END WITH BACKSLASH \\
Line 5 LAST Z

That code could easily be adapted for an iterator if one is required for your application.
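As a rough sketch of that adaptation, the loop can be wrapped in a closure that returns one joined record per call. The name `make_record_iterator`, the in-memory filehandle, and the choice to return a dangling continuation at end of input are all assumptions made for illustration; they are not taken from the original question.

```perl
use strict;
use warnings;

# Hypothetical helper: wraps a filehandle in a closure that returns one
# joined record per call, or undef once the input is exhausted.
sub make_record_iterator {
    my ($fh) = @_;
    my $re = qr{^(.*)(?<![\\])[\\]\n$};

    return sub {
        my $record = '';
        while (defined(my $part = <$fh>)) {
            if ($part =~ $re) {
                $record .= $1;       # continuation: drop backslash and newline
                next;
            }
            return $record . $part;  # complete record
        }
        # End of input: hand back a dangling continuation, if there is one.
        return length($record) ? $record : undef;
    };
}

# Usage with an in-memory filehandle (sample data is made up).
my $text = "Line 1 Part A \\\nLine 1 Part B\nLine 2 ALL\n";
open my $fh, '<', \$text or die "Cannot open in-memory handle: $!";

my $next_record = make_record_iterator($fh);
my @records;
while (defined(my $record = $next_record->())) {
    push @records, $record;
}
print @records;
```

Because the partial record lives in a lexical inside the closure, the caller only ever sees complete records.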

If you're not familiar with negative look-behind assertions ((?<!pattern)), they're documented under Look-Around Assertions in "perlre: Extended Patterns".
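To see what the look-behind contributes, here is a small check of the same regex against a few made-up sample strings (the samples are illustrative only):

```perl
use strict;
use warnings;

my $re = qr{^(.*)(?<![\\])[\\]\n$};

my @samples = (
    "ends with one backslash \\\n",      # continuation marker
    "ends with two backslashes \\\\\n",  # escaped backslash, not a continuation
    "no backslash at all\n",
);

# 1 where the line is treated as a continuation, 0 otherwise.
my @results = map { $_ =~ $re ? 1 : 0 } @samples;

for my $i (0 .. $#samples) {
    printf "%-4s%s", ($results[$i] ? 'yes' : 'no'), $samples[$i];
}
```

Only the first sample matches: in the second, the character before the final backslash is itself a backslash, so `(?<![\\])` fails. Note that this simple test also rejects a line ending in three backslashes, so it does not try to distinguish an escaped backslash followed by a genuine continuation.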

-- Ken

Re^2: Iterator to parse multiline string with \\n terminator by three18ti (Monk) on Oct 06, 2013 at 09:16 UTC
Neat! Thanks for the link. I've been reading Higher Order Perl and was just reading the chapter on Lexers where MJD makes use of look-behind assertions. This actually helps make more sense of what I was reading. What is the difference between next and redo in this context? A user below had a similar solution but used redo instead of next.
Re^3: Iterator to parse multiline string with \\n terminator by smls (Friar) on Oct 06, 2013 at 11:47 UTC
The difference is that `redo` does not re-evaluate the loop condition (in this case `(<DATA>)`, which fetches the next line) before running the loop body again, whereas `next` does. This is why jwkrahn's solution fetches the next line manually before calling `redo`: `$_ .= <$fh>;` The advantage of jwkrahn's solution with `redo` is that the implicit variable `$_` can be used to store the complete multiline record. The advantage of kcott's solution with `next` is that the `<>` operator for fetching the next line is used in only one place (inside the loop condition). However, re-evaluating the loop condition also overwrites `$_`, so in this case a separate variable needs to be declared above the loop to store the current record.
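For comparison, a `redo`-based loop might look roughly like the sketch below. This is not jwkrahn's actual code, which is not quoted in this thread apart from the `$_ .= <$fh>;` line; the substitution, the `eof` guard, and the sample data are assumptions made for illustration.

```perl
use strict;
use warnings;

# In-memory filehandle with made-up sample data.
my $text = "Line 1 Part A \\\nLine 1 Part B\nLine 2 ALL\n";
open my $fh, '<', \$text or die "Cannot open in-memory handle: $!";

my @records;
while (<$fh>) {
    # Strip a trailing unescaped backslash plus newline. If that succeeded
    # and more input remains, append the next line to $_ and rerun the body
    # without re-evaluating the loop condition (so $_ is not overwritten).
    if (s/(?<![\\])[\\]\n\z// and !eof($fh)) {
        $_ .= <$fh>;
        redo;
    }
    push @records, $_;   # $_ now holds the complete record
}
print @records;
```

Here `$_` accumulates the whole record, so no extra variable is needed, at the cost of reading from the filehandle in two places.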
