PerlMonks  

Re: Iterator to parse multiline string with \\n terminator

by kcott (Abbot)
on Oct 06, 2013 at 06:34 UTC (#1057125)


in reply to Iterator to parse multiline string with \\n terminator

G'day three18ti,

In the absence of seeing a context requiring anything more complex, I'd probably code something along these lines:

#!/usr/bin/env perl

use strict;
use warnings;

my $re = qr{^(.*)(?<![\\])[\\]\n$};
my $line = '';

while (<DATA>) {
    if (/$re/) {
        $line .= $1;
        next;
    }
    $line .= $_;
    print $line;
    $line = '';
}

__DATA__
Line 1 Part A \
Line 1 Part B \
Line 1 Part C
Line 2 ALL
Line 3 Part X \
Line 3 Part Y
Line 4 END WITH BACKSLASH \\
Line 5 LAST Z

Output:

Line 1 Part A Line 1 Part B Line 1 Part C
Line 2 ALL
Line 3 Part X Line 3 Part Y
Line 4 END WITH BACKSLASH \\
Line 5 LAST Z

That code could easily be adapted for an iterator if one is required for your application.
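One way such an adaptation might look is sketched below. It is only an illustration, not code from the original thread: the name make_line_iterator is hypothetical, and the sketch assumes the input arrives through a filehandle (an in-memory one here). It reuses the same regex and wraps the loop in a closure that returns one joined record per call, and undef once the input is exhausted.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Hypothetical helper (not from the original post): wraps a filehandle in a
# closure that returns one joined logical line per call, undef at end of input.
sub make_line_iterator {
    my ($fh) = @_;
    my $re = qr{^(.*)(?<![\\])[\\]\n$};    # same pattern as in the loop above

    return sub {
        my $line = '';
        while (defined(my $chunk = <$fh>)) {
            if ($chunk =~ $re) {
                $line .= $1;               # continuation: keep text before the backslash
                next;
            }
            return $line . $chunk;         # complete record
        }
        # End of input: hand back a dangling continuation if one is pending.
        return length $line ? $line : undef;
    };
}

# Usage with an in-memory filehandle standing in for a real file.
my $text = "Line 1 Part A \\\nLine 1 Part B\nLine 2 ALL\n";
open my $fh, '<', \$text or die "Cannot open in-memory handle: $!";

my $next_line = make_line_iterator($fh);
while (defined(my $record = $next_line->())) {
    print $record;
}
```

Unlike the plain loop, this sketch also returns a record whose final line ends in a continuation backslash at end of input; whether that is the desired behaviour depends on the application.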

If you're not familiar with negative look-behind assertions ((?<!pattern)), they're documented under Look-Around Assertions in "perlre: Extended Patterns".
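To see what the look-behind contributes, the following small sketch runs the same regex against three sample lines. The helper name is_continuation and the sample strings are made up for illustration. The (?<![\\]) part rejects a trailing backslash that is itself preceded by a backslash, which is why a line ending in two backslashes is treated as complete.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

my $re = qr{^(.*)(?<![\\])[\\]\n$};

# Hypothetical helper for illustration: returns 1 if the line ends in a
# single, unescaped backslash (a continuation line), 0 otherwise.
sub is_continuation {
    my ($line) = @_;
    return $line =~ $re ? 1 : 0;
}

for my $sample ("ends with one \\\n", "ends with two \\\\\n", "no backslash\n") {
    (my $shown = $sample) =~ s/\n\z//;    # drop the newline for display only
    printf "%-20s => %s\n", $shown,
        is_continuation($sample) ? 'continuation' : 'complete';
}
```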

-- Ken


Re^2: Iterator to parse multiline string with \\n terminator
by three18ti (Scribe) on Oct 06, 2013 at 09:16 UTC

    Neat! Thanks for the link.

    I've been reading Higher Order Perl and was just reading the chapter on Lexers where MJD makes use of look-behind assertions. This actually helps make more sense of what I was reading.

    What is the difference between next and redo in this context? A user below had a similar solution but used redo instead of next.

      The difference is that redo does not re-evaluate the loop condition (in this case: "(<DATA>)", which fetches the next line) before evaluating the loop body again, whereas next does.

      This is why in jwkrahn's solution, the next line is fetched manually before calling redo:

      $_ .= <$fh>;

      The advantage of jwkrahn's solution with redo is that the implicit variable $_ can be used to store the complete multiline record.

      The advantage of kcott's solution with next is that the <> operator for fetching the next line appears in only one place (inside the loop condition). However, re-evaluating the loop condition also overwrites $_, so a separate variable has to be declared above the loop to accumulate the current record.
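For comparison with the next-based loop in kcott's post, a redo-based loop might look roughly like the sketch below. This is not jwkrahn's actual code, which is not reproduced in this thread; the helper name join_with_redo and the eof guard are assumptions made for illustration. The complete record accumulates in $_, and the next line is fetched manually before redo because redo skips the loop condition.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Hypothetical helper: joins continuation lines read from $fh using redo
# and returns the list of complete records.
sub join_with_redo {
    my ($fh) = @_;
    my @records;
    local $_;    # do not clobber the caller's $_
    while (<$fh>) {
        # Strip a trailing unescaped backslash plus newline, if present.
        if (s/(?<![\\])[\\]\n\z// && !eof($fh)) {
            $_ .= <$fh>;    # fetch the next line manually ...
            redo;           # ... and rerun the body without re-reading
        }
        push @records, $_;  # $_ now holds one complete record
    }
    return @records;
}

# Usage with an in-memory filehandle standing in for a real file.
my $text = "Line 1 Part A \\\nLine 1 Part B \\\nLine 1 Part C\nLine 2 ALL\n";
open my $fh, '<', \$text or die "Cannot open in-memory handle: $!";
print for join_with_redo($fh);
```

The eof check keeps the sketch from appending undef when the last line of input ends in a continuation backslash.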
