PerlMonks  

Re: Extract Multiple Lines from the End of a multi-line string

by monarch (Priest)
on Oct 17, 2008 at 02:57 UTC (#717644)


in reply to Extract Multiple Lines from the End of a multi-line string

Here is my regexp approach - note that I have two regexps - one for a single line match, and one for a multiple line match:

use strict;

my $re_single = qr/
  (?:\A|[\r\n])   # either the start of text or newline
  (
    [^\r\n]*      # any non-newlines
    (?:\z|(?:\r\n|\n\r|\r|\n)\z) # last newline or eof
  )
/xs;

sub Last_N_Lines {
  my ( $str, $n ) = @_;

  my $nless = ( $n - 1 );

  my $re_multiple = qr/
    (?:\A|[\r\n])   # either the start of text or newline
    (
      (?:
        [^\r\n]*
        (?:\r\n|\n\r|\r|\n)
      ){0,$nless}
    )
    # have to enclose above so it all goes into $1
    (
      [^\r\n]*
      (?:\z|(?:\r\n|\n\r|\r|\n)\z) # last nl or eof
    )
  /xs;

  if ( $n > 1 ) {
    if ( $str =~ m/$re_multiple/ ) {
      return( "$1$2" );
    } else {
      return( "" );
    }
  }

  if ( $str =~ m/$re_single/ ) {
    return( $1 );
  } else {
    return( "" );
  }
};

my $String = "This is line 1
This is Line 2
This is Line 3
This is Line 4
This is Line 5
This is Line 6
This is Line 7
This is Line 8
This is Line 9
This is Line 10";

print("Last\[5\] Lines.\[\n" . Last_N_Lines($String,5) . "\]\n");

Outputs:
Last[5] Lines.[
This is Line 6
This is Line 7
This is Line 8
This is Line 9
This is Line 10]


Re^2: Extract Multiple Lines from the End of a multi-line string
by ikegami (Pope) on Oct 17, 2008 at 04:09 UTC

    Wow, that's complex. And for what, to handle \r|\n|\r\n? Even in portable applications that's not necessary. But if you need that functionality for some other reason, converting \r|\n|\r\n to \n first and then dealing only with \n would be much, much simpler. Probably faster too.

      I tend to go for ultra portability.. and qr/(?:\r\n|\n\r|\n|\r)/s will likely handle any end-of-line scenario in the modern world. Because the two-byte sequences (like \r\n) are tested before the single-byte sequences (like \n), a pair such as \r\n is consumed as one line ending rather than two.
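
The ordering point can be checked with a small sketch. The sample text and variable names below are assumptions made up for illustration; only the alternation itself comes from the post. It splits the same mixed-ending string twice, once with the two-byte sequences first and once with the single-byte ones first:

```perl
use strict;
use warnings;

# Hypothetical sample with three different line endings: CRLF, LF, bare CR.
my $text = "one\r\ntwo\nthree\rfour";

# Two-byte sequences listed first, as in the post.
my @long_first = split /\r\n|\n\r|\r|\n/, $text;

# Single-byte alternatives listed first: "\r\n" is consumed as "\r" and
# then "\n", which leaves an empty field between the two matches.
my @short_first = split /\r|\n|\r\n|\n\r/, $text;

printf "two-byte first:    %d fields\n", scalar @long_first;
printf "single-byte first: %d fields\n", scalar @short_first;
```

With the single-byte alternatives first, the extra field is the empty string between the \r and the \n of the CRLF pair.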

        I tend to go for ultra portability..

        I've already said it doesn't help portability.
        On unix, the end of line is \n
        On Windows, the end of line is \n
        On old Macs, the end of line is \n
        On new Macs, the end of line is \n
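
One way to read the list above is that Perl's text-mode IO hands the program "\n" regardless of what is stored on disk. The sketch below is an illustration under assumptions, not part of the original reply: it uses an in-memory filehandle and an explicit :crlf layer to imitate, on any platform, the translation that text mode performs on Windows.

```perl
use strict;
use warnings;

# Raw bytes as a Windows text file would store them (assumed sample data).
my $raw = "first\r\nsecond\r\n";

# The :crlf layer converts each CR LF pair to a single "\n" while reading.
open my $fh, '<:crlf', \$raw or die "cannot open in-memory file: $!";
my @lines = <$fh>;
close $fh;

# Each element now ends in a bare "\n", so a plain chomp is enough.
chomp @lines;
print join( '|', @lines ), "\n";
```

Code downstream of the read then only has to deal with "\n".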

        qr/(?:\r\n|\n\r|\n|\r)/s will likely handle any end-of-line scenario in the modern world.

        I didn't deny that. I said it should be centralized. Compare

        my $re_single = qr/
          (?:\A|[\r\n])   # either the start of text or newline
          (
            [^\r\n]*      # any non-newlines
            (?:\z|(?:\r\n|\n\r|\r|\n)\z) # last newline or eof
          )
        /xs;

        my $re_multiple = qr/
          (?:\A|[\r\n])   # either the start of text or newline
          (
            (?:
              [^\r\n]*
              (?:\r\n|\n\r|\r|\n)
            ){0,$nless}
          )
          # have to enclose above so it all goes into $1
          (
            [^\r\n]*
            (?:\z|(?:\r\n|\n\r|\r|\n)\z) # last nl or eof
          )
        /xs;

        to

        my $re_single = qr/
          ^
          ( .* \n? \z )
        /mx;

        my $re_multiple = qr/
          ^
          ( (?: .* \n ){0,$nless} )
          ( .* \n? \z )
        /mx;

        $String =~ s/\r\n|\n\r|\r|\n/\n/g;

        In short, it's not Last_N_Lines's job to decode IO.


        By the way, there are more improvements you can make.

        • $re_single is just a special case of $re_multiple. You can use $re_multiple even when only one line is needed.
        • You don't need two captures. Combine them into one.
        • This is the perfect place for the ternary operator.

        Compare

        my $re_single = qr/
          (?:\A|[\r\n])   # either the start of text or newline
          (
            [^\r\n]*      # any non-newlines
            (?:\z|(?:\r\n|\n\r|\r|\n)\z) # last newline or eof
          )
        /xs;

        sub Last_N_Lines {
          my ( $str, $n ) = @_;

          my $nless = ( $n - 1 );

          my $re_multiple = qr/
            (?:\A|[\r\n])   # either the start of text or newline
            (
              (?:
                [^\r\n]*
                (?:\r\n|\n\r|\r|\n)
              ){0,$nless}
            )
            # have to enclose above so it all goes into $1
            (
              [^\r\n]*
              (?:\z|(?:\r\n|\n\r|\r|\n)\z) # last nl or eof
            )
          /xs;

          if ( $n > 1 ) {
            if ( $str =~ m/$re_multiple/ ) {
              return( "$1$2" );
            } else {
              return( "" );
            }
          }

          if ( $str =~ m/$re_single/ ) {
            return( $1 );
          } else {
            return( "" );
          }
        }

        to

        sub Last_N_Lines {
          my ( $str, $n ) = @_;
          my $nless = ( $n - 1 );
          return $str =~ /
            ^
            ( (?: .* \n ){0,$nless} .* \n? )
            \z
          /mx ? $1 : "";
        }

        $String =~ s/\r\n|\n\r|\r|\n/\n/g;
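
For anyone who wants to run the shorter version, here is a minimal self-contained sketch. The function body is the one from the reply above; the CRLF test string and the calls around it are assumptions added for illustration. The caller normalizes line endings once, and Last_N_Lines only ever sees "\n":

```perl
use strict;
use warnings;

# Single-regex version from the reply: up to $n - 1 complete lines,
# followed by a final line that must end at \z.
sub Last_N_Lines {
    my ( $str, $n ) = @_;
    my $nless = $n - 1;
    return $str =~ / ^ ( (?: .* \n ){0,$nless} .* \n? ) \z /mx ? $1 : "";
}

# Assumed input: ten lines joined with CRLF, as if read in binary mode.
my $String = join "\r\n", map { "This is Line $_" } 1 .. 10;

# Normalize at the boundary, outside Last_N_Lines.
$String =~ s/\r\n|\n\r|\r|\n/\n/g;

print "Last[3] Lines.[\n", Last_N_Lines( $String, 3 ), "]\n";
print "Last[1] Lines.[\n", Last_N_Lines( $String, 1 ), "]\n";
```

Asking for more lines than the string contains returns the whole string, since the {0,$nless} quantifier only sets an upper bound. A count of zero is not handled by this sketch.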
