PerlMonks  

Extract Multiple Lines from the End of a multi-line string

by NateTut (Deacon)
on Oct 16, 2008 at 20:47 UTC ( [id://717586]=perlquestion )

NateTut has asked for the wisdom of the Perl Monks concerning the following question:

There has got to be a better way to do the following than this:
use strict;
use warnings;

sub Last_N_Lines
{
   my($String, $N) = @_;

   # Split into lines, then drop everything except the last $N of them.
   my @Lines = split(/\n/, $String);
   if(scalar(@Lines) > $N)
   {
      splice(@Lines, 0, scalar(@Lines) - $N);
   }
   return(join("\n", @Lines))
}

my $String = "This is line 1
This is Line 2
This is Line 3
This is Line 4
This is Line 5
This is Line 6
This is Line 7
This is Line 8
This is Line 9
This is Line 10";

print("Last\[5\] Lines.\[\n" . Last_N_Lines($String, 5) . "\]\n");
This works fine and produces this output:
Last[5] Lines.[
This is Line 6
This is Line 7
This is Line 8
This is Line 9
This is Line 10]
I struggled with various regex ways of getting the same result, but it was beyond me. Google and Super Searches were not helpful either.

Replies are listed 'Best First'.
Re: Extract Multiple Lines from the End of a multi-line string
by GrandFather (Saint) on Oct 16, 2008 at 21:37 UTC
    sub Last_N_Lines {
        my ($str, $n) = @_;
        return join "\n", grep {defined} (split "\n", $str)[-$n .. -1];
    }

    Perl reduces RSI - it saves typing
Re: Extract Multiple Lines from the End of a multi-line string
by ikegami (Patriarch) on Oct 16, 2008 at 20:54 UTC
    How about
    sub Last_N_Lines {
        my($String, $N) = @_;
        my $re = '^' . ( '.*\\n' x ($N-1) ) . '(?:.*\\n|.+)';
        return ( $String =~ /($re)\z/m )[0];   # /m lets ^ match at the start of any line
    }

    or

    sub Last_N_Lines {
        my($String, $N) = @_;
        --$N;
        return ( $String =~ /^( (?:.*\n){$N} (?:.*\n|.+) )\z/mx )[0];
    }

    Update: Added alternative.
    Update: Added leading anchor to reduce backtracking.

      It works great but I'm not sure I understand completely.
      '^' . ( '.*\\n' x ($N-1) ) .
      This matches the N-1 lines before the last one.
      '(?:.+|.*\\n)'
      I'm a little fuzzier on this bit. ?: is for a non-capturing group, I think. The .*\\n must be the last line, especially when used with the \z, but what is the .+| for? And what about the [0] on the end of the return?

        I defined a line as either "zero or more non-newline characters followed by a newline (/.*\n/)" or "one or more non-newline characters (/.+/)". That way,

        • "foo\nbar\n" is considered to have two lines ("foo\n" and "bar\n")
        • "foo\nbar" is considered to have two lines ("foo\n" and "bar")

        That's the same behaviour as <FH>.

        If I had used /.*\n?/ instead of /.*\n|.+/, "foo\nbar\n" would have been considered to have three lines ("foo\n", "bar\n" and ""). Always be wary of patterns that can match zero characters.
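The difference can be checked with a small sketch that counts the matches of each pattern using a global match in list context. The helper name count_lines is made up for this illustration and is not part of any of the posted solutions.

```perl
use strict;
use warnings;

# Hypothetical helper: count how many "lines" a pattern finds in a string.
sub count_lines {
    my ($str, $re) = @_;
    my @lines = $str =~ /$re/g;   # no capture groups, so each element is a whole match
    return scalar @lines;
}

my $strict = qr/.*\n|.+/;   # newline-terminated line, or a non-empty final line
my $loose  = qr/.*\n?/;     # can also match the empty string at the very end

print count_lines("foo\nbar\n", $strict), "\n";   # 2
print count_lines("foo\nbar",   $strict), "\n";   # 2
print count_lines("foo\nbar\n", $loose),  "\n";   # 3
```

The third count comes from the empty match that /.*\n?/ finds after the final newline.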

        Oops, missed the second question.

        and also the [0] on the end of the return?

        In list context, the match operator returns what it captured. I used a list slice to force list context.

        $x =   'abc' =~ /a(.)c/;        # 1 (or false on fail)
        @x =   'abc' =~ /a(.)c/;        # b (or () on fail)
        $x = ( 'abc' =~ /a(.)c/ )[0];   # b (or undef on fail)
        I could also have used
        $x = 'abc' =~ /a(.)c/ && $1; # b (or false on fail)
Re: Extract Multiple Lines from the End of a multi-line string
by toolic (Bishop) on Oct 16, 2008 at 21:19 UTC
    Different? Yes. Better? You decide.
    use strict;
    use warnings;

    my $String = "This is line 1\n"
               . "This is Line 2\n"
               . "This is Line 3\n"
               . "This is Line 4\n"
               . "This is Line 5\n"
               . "This is Line 6\n"
               . "This is Line 7\n"
               . "This is Line 8\n"
               . "This is Line 9\n"
               . "This is Line 10";

    print "Last [ 5] Lines.[\n" . Last_N_Lines($String, 5) . "]\n\n";
    print "Last [12] Lines.[\n" . Last_N_Lines($String, 12) . "]\n";

    sub Last_N_Lines {
        my ($String, $N) = @_;
        my $count = ($String =~ tr/\n//);
        return $String if $count < $N;
        my @Lines = (reverse split /\n/, $String)[0..$N-1];
        return (join "\n", reverse @Lines);
    }

    __END__
    Last [ 5] Lines.[
    This is Line 6
    This is Line 7
    This is Line 8
    This is Line 9
    This is Line 10]

    Last [12] Lines.[
    This is line 1
    This is Line 2
    This is Line 3
    This is Line 4
    This is Line 5
    This is Line 6
    This is Line 7
    This is Line 8
    This is Line 9
    This is Line 10]
Re: Extract Multiple Lines from the End of a multi-line string
by monarch (Priest) on Oct 17, 2008 at 02:57 UTC
    Here is my regexp approach - note that I have two regexps - one for a single line match, and one for a multiple line match:
    use strict;

    my $re_single = qr/
        (?:\A|[\r\n])   # either the start of text or newline
        (
          [^\r\n]*      # any non-newlines
          (?:\z|(?:\r\n|\n\r|\r|\n)\z)   # last newline or eof
        )
    /xs;

    sub Last_N_Lines {
        my ( $str, $n ) = @_;

        my $nless = ( $n - 1 );
        my $re_multiple = qr/
            (?:\A|[\r\n])   # either the start of text or newline
            (
              (?:
                [^\r\n]*
                (?:\r\n|\n\r|\r|\n)
              ){0,$nless}
            )
            # have to enclose above so it all goes into $1
            (
              [^\r\n]*
              (?:\z|(?:\r\n|\n\r|\r|\n)\z)   # last nl or eof
            )
        /xs;

        if ( $n > 1 ) {
            if ( $str =~ m/$re_multiple/ ) {
                return( "$1$2" );
            }
            else {
                return( "" );
            }
        }

        if ( $str =~ m/$re_single/ ) {
            return( $1 );
        }
        else {
            return( "" );
        }
    };

    my $String = "This is line 1\n"
               . "This is Line 2\n"
               . "This is Line 3\n"
               . "This is Line 4\n"
               . "This is Line 5\n"
               . "This is Line 6\n"
               . "This is Line 7\n"
               . "This is Line 8\n"
               . "This is Line 9\n"
               . "This is Line 10";

    print("Last\[5\] Lines.\[\n" . Last_N_Lines($String,5) . "\]\n");
    Outputs:
    Last[5] Lines.[
    This is Line 6
    This is Line 7
    This is Line 8
    This is Line 9
    This is Line 10]

      Wow, that's complex. And for what, to handle \r|\n|\r\n? Even in portable applications that's not necessary. But if you need that functionality for some other reason, converting \r|\n|\r\n to \n first and then just dealing with \n would be much, much simpler. Probably faster too.
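A minimal sketch of that suggestion, assuming it is acceptable for the result to come back with "\n" line endings regardless of what the input used. The function name last_n_lines_normalized is invented for this example.

```perl
use strict;
use warnings;

# Sketch: convert every line-ending style to "\n", then take the last N lines.
sub last_n_lines_normalized {
    my ($str, $n) = @_;
    $str =~ s/\r\n|\n\r|\r/\n/g;   # try the two-character endings before a lone \r
    my @lines = split /\n/, $str;
    splice(@lines, 0, @lines - $n) if @lines > $n;   # drop all but the last $n
    return join "\n", @lines;
}

# Mixed endings in one string: CRLF, bare CR and bare LF.
print last_n_lines_normalized("a\r\nb\rc\nd", 2), "\n";   # prints "c" and "d" on separate lines
```

After the substitution the rest of the function is the same split-and-splice logic as in the original question.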

        I tend to go for ultra portability.. and qr/(?:\r\n|\n\r|\n|\r)/s will likely handle any end-of-line scenario in the modern world. The fact that the two byte sequences (like \r\n) are tested before the single byte sequences (like \n) ensures that the correct behaviour will take place.
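The effect of the alternation order can be seen with a short sketch that splits the same CRLF text both ways. The variable names are illustrative only.

```perl
use strict;
use warnings;

my $text = "one\r\ntwo\r\nthree";

# Two-character sequences first: each "\r\n" is consumed as one separator.
my @good = split /\r\n|\n\r|\n|\r/, $text;

# Single characters first: "\r" matches alone, then "\n" splits again,
# leaving an empty field between them.
my @bad = split /\r|\n|\r\n/, $text;

print scalar(@good), "\n";   # 3
print scalar(@bad),  "\n";   # 5
```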
Re: Extract Multiple Lines from the End of a multi-line string
by repellent (Priest) on Oct 17, 2008 at 16:52 UTC
    Got Perl v5.8.0+ ?
    sub Last_N_Lines {
        my ($String, $N) = @_;
        my @ret;

        open(my $IN, "<", \$String);
        while (<$IN>) {
            push(@ret, $_);
            shift(@ret) if $. > $N
        }

        return join "", @ret;
    }

    Updated: Thanks, ikegami!
      That doesn't print the last N lines. That prints lines [N+1 .. last].
