Getting the last line

by rolandde (Initiate)
on Oct 06, 2011 at 02:39 UTC
rolandde has asked for the wisdom of the Perl Monks concerning the following question:

I'm trying to import a specific file format (FASTA):
    my $sequence_to_parse = ">test\nATG\nGGG";
    while ($sequence_to_parse =~ /^>.*\n(^(?!>).*$)+/gm) {
        print "$&\n";
    }
This regular expression always misses the last line (GGG in this case), and I cannot figure out why.

Replies are listed 'Best First'.
Re: Getting the last line
by Anonymous Monk on Oct 06, 2011 at 03:13 UTC

    For the simplest of reasons, it doesn't match :)

    With re 'debug'


    One problem I think I see is trying to use a look-ahead, (?!pattern), as if it were a look-behind.

    I would try something simple, like

    my $sequence_to_parse = ">test\nATG\nGGG";
    while ( $sequence_to_parse =~ m/(^>.+)|(^.+)/gm ) {
        if ( defined $1 ) {
            print "got first line \$1 ($1)\n";
        }
        elsif ( defined $2 ) {
            print "got other line \$2 ($2)\n";
        }
        else {
            print "UH OH \n";
        }
    }
    __END__
    got first line $1 (>test)
    got other line $2 (ATG)
    got other line $2 (GGG)
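The alternation above only labels each line. As a rough sketch of how that could be turned into an import step, the lines can be accumulated into one record per header. The sub name parse_fasta and the header/sequence hash keys are made up for illustration and are not from the thread; the sketch also assumes plain "\n" line endings, since "." would pick up a stray "\r".

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): group lines into FASTA records.
sub parse_fasta {
    my ($text) = @_;
    my @records;

    # Same alternation as above: $1 is a header line, $2 is any other line.
    while ( $text =~ m/(^>.+)|(^.+)/gm ) {
        if ( defined $1 ) {
            # Start a new record; drop the leading '>' from the header.
            push @records, { header => substr( $1, 1 ), sequence => '' };
        }
        elsif ( defined $2 && @records ) {
            # Append each sequence line to the most recent record.
            $records[-1]{sequence} .= $2;
        }
    }
    return @records;
}

my @records = parse_fasta(">test\nATG\nGGG");
print "$_->{header}: $_->{sequence}\n" for @records;
```

Lines that appear before the first header are silently skipped in this sketch; a real importer would probably want to warn about them.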

      To get just the last line, treating a line as a run of characters other than \r and \n:

      my $sequence_to_parse = ">test\nATG\nGGG";
      print $sequence_to_parse =~ /([^\r\n]+)$/s;
      __END__
      GGG

      use YAPE::Regex::Explain;
      print YAPE::Regex::Explain->new( qr/([^\r\n]+)$/s )->explain;
      __END__
      The regular expression:

      (?s-imx:([^\r\n]+)$)

      matches as follows:

      NODE                     EXPLANATION
      ----------------------------------------------------------------------
      (?s-imx:                 group, but do not capture (with . matching
                               \n) (case-sensitive) (with ^ and $
                               matching normally) (matching whitespace
                               and # normally):
      ----------------------------------------------------------------------
        (                        group and capture to \1:
      ----------------------------------------------------------------------
          [^\r\n]+                 any character except: '\r' (carriage
                                   return), '\n' (newline) (1 or more
                                   times (matching the most amount
                                   possible))
      ----------------------------------------------------------------------
        )                        end of \1
      ----------------------------------------------------------------------
        $                        before an optional \n, and the end of
                                 the string
      ----------------------------------------------------------------------
      )                        end of grouping
      ----------------------------------------------------------------------
Re: Getting the last line
by ramprasad27 (Sexton) on Oct 06, 2011 at 06:50 UTC
    Try this:

    my $sequence_to_parse = ">test\nATG\nGGG";
    while ($sequence_to_parse =~ /^>.*\n(^(?!>).*\n.*$)+/gm) {
        print "$&\n";
    }
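In both the original pattern and this one, the repeated group ends at $ without consuming the newline. A second repetition would need ^ to match at the position just before "\n", which it cannot, so the group only ever runs once. That is why the original stops after ATG, and why this variant works for exactly two sequence lines. A minimal sketch of one way around that, assuming a record is a '>' header followed by any number of non-empty lines that do not start with '>' (the sub name fasta_records is hypothetical):

```perl
use strict;
use warnings;

# Hypothetical helper: return each whole record (header plus sequence lines).
sub fasta_records {
    my ($text) = @_;
    my @found;

    # (?!>) rejects header lines; .+\n? consumes the line AND its newline,
    # so the next repetition starts at the beginning of the following line.
    while ( $text =~ /^>.*\n(?:(?!>).+\n?)+/gm ) {
        push @found, $&;
    }
    return @found;
}

print "$_\n" for fasta_records(">test\nATG\nGGG");
```

For the sample string, the single match now covers all three lines. Because the group uses .+ rather than .*, a blank line ends the record in this sketch.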
