Re^3: Extract Multiple Lines from the End of a multi-line string

by monarch (Priest)
on Oct 19, 2008 at 12:03 UTC (#718046)


in reply to Re^2: Extract Multiple Lines from the End of a multi-line string
in thread Extract Multiple Lines from the End of a multi-line string

I tend to go for ultra portability.. and qr/(?:\r\n|\n\r|\n|\r)/s will likely handle any end-of-line scenario in the modern world. Because the two-byte sequences (like \r\n) are tried before the single-byte ones (like \n), a \r\n pair is consumed as one line ending rather than as two.
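
To make the ordering point concrete, here is a small sketch. The variable names and the sample string are made up for illustration; only the alternation itself comes from the post above.

```perl
use strict;
use warnings;

# Two-byte endings listed first, as in the pattern above.
my $ordered   = qr/(?:\r\n|\n\r|\n|\r)/;
# Same alternatives, but single-byte endings listed first (illustrative only).
my $unordered = qr/(?:\n|\r|\r\n|\n\r)/;

# Sample text mixing CRLF, bare CR and bare LF line endings.
my $text = "one\r\ntwo\rthree\n";

# With the ordered pattern, "\r\n" counts as one separator.
my @good = split $ordered, $text;

# With the unordered pattern, "\r" and "\n" match separately,
# so an empty field appears between "one" and "two".
my @bad = split $unordered, $text;

printf "ordered: %d fields, unordered: %d fields\n",
    scalar @good, scalar @bad;
```

Alternation in Perl tries the branches left to right at each position, which is why the longer sequences have to come first.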


Re^4: Extract Multiple Lines from the End of a multi-line string
by ikegami (Pope) on Oct 19, 2008 at 17:01 UTC

    I tend to go for ultra portability..

    I've already said it doesn't help portability.
    On unix, the end of line is \n
    On Windows, the end of line is \n
    On old Macs, the end of line is \n
    On new Macs, the end of line is \n

    qr/(?:\r\n|\n\r|\n|\r)/s will likely handle any end-of-line scenario in the modern world.

    I didn't deny that. I said it should be centralized. Compare

        my $re_single = qr/
            (?:\A|[\r\n])      # either the start of text or newline
            (
                [^\r\n]*       # any non-newlines
                (?:\z|(?:\r\n|\n\r|\r|\n)\z)   # last newline or eof
            )
        /xs;

        my $re_multiple = qr/
            (?:\A|[\r\n])      # either the start of text or newline
            (
                (?:
                    [^\r\n]*
                    (?:\r\n|\n\r|\r|\n)
                ){0,$nless}
            )
            # have to enclose above so it all goes into $1
            (
                [^\r\n]*
                (?:\z|(?:\r\n|\n\r|\r|\n)\z)   # last nl or eof
            )
        /xs;

    to

        my $re_single = qr/ ^ ( .* \n? \z ) /mx;

        my $re_multiple = qr/
            ^
            ( (?: .* \n ){0,$nless} )
            ( .* \n? \z )
        /mx;

        $String =~ s/\r\n|\n\r|\r|\n/\n/g;

    In short, it's not Last_N_Lines's job to decode IO.
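
    A minimal sketch of what "centralized" could look like: the substitution lives in one place and runs once on the incoming text, so everything downstream only has to know about \n. The helper name normalize_newlines and the sample data are hypothetical, not from the thread.

```perl
use strict;
use warnings;

# Hypothetical helper: fold the common line endings into "\n".
# Two-byte sequences come first so "\r\n" becomes a single "\n".
sub normalize_newlines {
    my ($text) = @_;
    $text =~ s/\r\n|\n\r|\r|\n/\n/g;
    return $text;
}

# Sample input mixing CRLF, bare CR and bare LF (made up for illustration).
my $raw   = "alpha\r\nbeta\rgamma\n";
my $clean = normalize_newlines($raw);

# From here on, code can rely on "\n" alone.
my @lines = split /\n/, $clean;
print scalar(@lines), " lines\n";
```

    Whether this belongs in a helper like this or in an IO layer on the filehandle depends on where the data comes from; the point is only that it happens once, outside Last_N_Lines.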


    By the way, there are more improvements you can make.

    • $re_single is just a special case of $re_multiple. You can use $re_multiple even when only one line is needed.
    • You don't need two captures. Combine them into one.
    • This is the perfect place for the ternary operator.

    Compare

        my $re_single = qr/
            (?:\A|[\r\n])      # either the start of text or newline
            (
                [^\r\n]*       # any non-newlines
                (?:\z|(?:\r\n|\n\r|\r|\n)\z)   # last newline or eof
            )
        /xs;

        sub Last_N_Lines {
            my ( $str, $n ) = @_;
            my $nless = ( $n - 1 );

            my $re_multiple = qr/
                (?:\A|[\r\n])      # either the start of text or newline
                (
                    (?:
                        [^\r\n]*
                        (?:\r\n|\n\r|\r|\n)
                    ){0,$nless}
                )
                # have to enclose above so it all goes into $1
                (
                    [^\r\n]*
                    (?:\z|(?:\r\n|\n\r|\r|\n)\z)   # last nl or eof
                )
            /xs;

            if ( $n > 1 ) {
                if ( $str =~ m/$re_multiple/ ) {
                    return( "$1$2" );
                }
                else {
                    return( "" );
                }
            }

            if ( $str =~ m/$re_single/ ) {
                return( $1 );
            }
            else {
                return( "" );
            }
        }

    to

        sub Last_N_Lines {
            my ( $str, $n ) = @_;
            my $nless = ( $n - 1 );
            return $str =~ / ^ ( (?: .* \n ){0,$nless} .* \n? ) \z /mx
                ? $1
                : "";
        }

        $String =~ s/\r\n|\n\r|\r|\n/\n/g;
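
    For anyone who wants to try the short version, here is a self-contained sketch. The sub is the one above; the sample string and the choice to normalize before calling it are assumptions made for the demo.

```perl
use strict;
use warnings;

sub Last_N_Lines {
    my ( $str, $n ) = @_;
    my $nless = $n - 1;
    # Up to $nless complete lines, then the final line, anchored at the end.
    return $str =~ / ^ ( (?: .* \n ){0,$nless} .* \n? ) \z /mx
        ? $1
        : "";
}

# Sample data with mixed line endings (made up for illustration).
my $String = "one\r\ntwo\rthree\nfour";

# Normalize once, before any line-oriented processing.
$String =~ s/\r\n|\n\r|\r|\n/\n/g;

# Prints the last two lines: "three" and "four".
print Last_N_Lines( $String, 2 ), "\n";
```

    Asking for more lines than the string contains simply returns the whole string, since {0,$nless} is an upper bound rather than an exact count.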
