
Re^2: Bioinformatics: Regex loop, no output

by AnomalousMonk (Archbishop)
on Nov 16, 2015 at 22:32 UTC ( [id://1147851]=note )


in reply to Re: Bioinformatics: Regex loop, no output
in thread Bioinformatics: Regex loop, no output

The output from your code shows some problems:

    The peptide is DAAAAATTLTTTAMTTTTTTC
    The peptide is MMFRPPPPPGGGGGGGGGGGG
    The peptide is ALTAMCMNVWEITYH
    The peptide is GSDVN
    The peptide is
    The peptide is ASFAQPPPQPPPPLLAIKPASDASD

The K or R terminating split codon (if that's the proper term) is being incorrectly removed from the output peptides. (At least, I think this is incorrect. TamaDP doesn't show desired output, but seems satisfied with output examples given in various replies in this thread that include these codons.) So I assume GSDVN should really be GSDVNR and the "null" sequence following it should really be the single-codon sequence R. This is all down to the incorrect definition of the s/// match pattern; take a look at some other replies in this thread for what I feel are more correct s/// patterns.
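
A minimal sketch of one way to keep the terminating K or R, assuming the rule is "split after K or R unless the next letter is P". The input string below is hypothetical, put together only for illustration, and the split pattern is my own assumption, not necessarily one of the patterns posted elsewhere in this thread:

```perl
use strict;
use warnings;

# Hypothetical input, chosen only to illustrate the rule.
my $protein = 'ALTAMCMNVWEITYHKGSDVNRRASFAQPPPQPPPPLLAIKPASDASD';

# Split at the zero-width position that follows a K or R
# and is not followed by a P. Nothing is consumed, so the
# K or R stays at the end of its peptide.
my @peptides = split /(?<=[KR])(?!P)/, $protein;

print "The peptide is $_\n" for @peptides;
# The peptide is ALTAMCMNVWEITYHK
# The peptide is GSDVNR
# The peptide is R
# The peptide is ASFAQPPPQPPPPLLAIKPASDASD
```

Because the pattern is built from a lookbehind and a lookahead, it matches between characters rather than on them, which is why no letters are lost and the KP near the end is left alone.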

On an unrelated note, the regex in the condition expression of the
    if ($protein =~ m/[K(?!P)|R(?!P)]/g) { ... }
block isn't doing what I think you think it's doing. The [K(?!P)|R(?!P)] character class is exactly equivalent to the [KPR()?!|] class; metacharacters (alternations, groupings, etc.) have no meaning in a character class, so ()?!| are just literal characters (and repeated characters have no effect whatsoever). Also, the /g modifier in the m//g match is useless in the boolean context of a conditional, although it does no harm (except to burn a few more innocent computrons). Again, all this doesn't affect the basic problem with the code, which stems from the incorrect s/// match.
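
A quick way to check the character-class claim is to compare the two classes over every ASCII character. This is only a sketch; the variable names are mine, and the last pattern is my guess at what was probably intended (K or R not followed by P), not something taken from the original code:

```perl
use strict;
use warnings;

my $as_written = qr/[K(?!P)|R(?!P)]/;  # the class from the if condition
my $simplified = qr/[KPR()?!|]/;       # the same set of literal characters
my $lookahead  = qr/[KR](?!P)/;        # assumed intent: K or R not followed by P

# Collect any ASCII character on which the two classes disagree.
my @disagree = grep {
    (chr($_) =~ $as_written) xor (chr($_) =~ $simplified)
} 0 .. 127;

print "classes disagree on ", scalar(@disagree), " characters\n";

# Inside the class, P, ( and | are ordinary characters, so they match.
for my $s ('P', '(', '|', 'KP', 'KA') {
    printf "%-2s  class: %-3s  lookahead: %s\n",
        $s,
        ($s =~ $as_written ? 'yes' : 'no'),
        ($s =~ $lookahead  ? 'yes' : 'no');
}
```

The class matches a lone P or a parenthesis, while the lookahead form matches only a K or R whose next character is not P.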

I use Data::Dumper all the time because I've been fooled by my data too many times.

Yea and amen brother, yea and amen.


Give a man a fish:  <%-{-{-{-<

Re^3: Bioinformatics: Regex loop, no output
by tonto (Friar) on Nov 17, 2015 at 21:05 UTC

    Thank you! I wondered whether I understood what was wanted; later posts show that I didn't. I shouldn't have posted that, and I'll stop myself next time.
