http://www.perlmonks.org?node_id=1018611


in reply to Urgent help required. Need a code to translate a given nucleotide sequence into proteins using codon table.

To be nice, here's the OP's code in a more readable format (i.e., with line breaks added and wrapped in proper code tags).

#!/usr/bin/perl -w
$sequence1='file1.txt';
open(SEQUENCE,$sequence1);
$seq=<SEQUENCE>;
print $seq, "\n";
$RNA=$seq;
$RNA=~s/T/U/g;
print "\n here is mRNA $RNA \n";
close SEQUENCE;
$rna1=$RNA;
print "\n Here is the 1st frame $rna1 \n" ;
$rna2=substr($RNA,1) ;
print " Here is the 2nd frame $rna2 \n";
$rna3=substr($RNA,2) ;
print "Here is the 3rd frame $rna3 \n";
$length1= length$rna1;
$length2= length$rna2;
$length3= length$rna3;
print "1st line ORFs\n";
for ($i = 0; $i <= ($length1 - 3); $i = $i + 3) {
    $codon1 = substr($rna1, $i, 3);
    print $codon1," ";
}
print "2nd line ORFs\n";
for ($i = 0; $i <= ($length2 - 3); $i = $i + 3) {
    $codon2 = substr($rna2, $i, 3);
    print $codon2," ";
}
print "\n 3rd line ORFs\n";
for ($i = 0; $i <= ($length3 - 3); $i = $i + 3) {
    $codon3 = substr($rna3, $i, 3);
    print $codon3," ";
}
local $_ = $RNA ;
while ( / AUG /g ) {
    my $start = pos () - 2 ;
    if ( / UGA|UAA|UAG /g ) {
        my $stop = pos ;
        $gene = substr ( $_ , $start - 1 , $stop - $start + 1 ), $/ ;
        print "$gene" ;
    }
    # The next set of commands translates the ORF found above for an amino acid seq.
    print "\nThe largest reading Frame is:\t\t\t" . $protein { "gene" } . "\n" ;
    sub translate {
        my ( $gene , $reading_frame ) = @_ ;
        my %protein = ();
        for ( $i = $reading_frame ; $i < length ( $gene ); $i += 3 ) {
            $codon = substr ( $gene , $i , 3 );
            $amino_acid = translate_codon( $codon );
            $protein { $amino_acid }++;
            $protein { "gene" } .= $amino_acid ;
        }
        return %protein ;
    }
    sub translate_codon {
        if ( $_ 0 =~ / GCAGCU /i ) { return A;} # Alanine;
        if ( $_ 0 =~ / UGC|UGU /i ) { return C;} # Cysteine
        if ( $_ 0 =~ / GAC|GAU /i ) { return D;} # Aspartic Acid;
        if ( $_ 0 =~ / GAA|GAG /i ) { return Q;} # Glutamine;
        if ( $_ 0 =~ / UUC|UUU /i ) { return F;} # Phenylalanine;
        if ( $_ 0 =~ / GGAGCU /i ) { return G;} # Glycine;
        if ( $_ 0 =~ / CAC|CAU /i ) { return His;} # Histine (start codon);
        if ( $_ 0 =~ / AUAUC /i ) { return I;} # Isoleucine;
        if ( $_ 0 =~ / AAA|AAG /i ) { return K;} # Lysine;
        if ( $_ 0 =~ / UUA|UUG|CUAGCU /i ) { return Leu;} # Leucine;
        if ( $_ 0 =~ / AUG /i ) { return M;} # Methionine;
        if ( $_ 0 =~ / AAC|AAU /i ) { return N;} # Asparagine;
        if ( $_ 0 =~ / CCAGCU /i ) { return P;} # Proline;
        if ( $_ 0 =~ / CAA|CAG /i ) { return G;} # Glutamine;
        if ( $_ 0 =~ / AGA|AGG|CGAGCU /i ) { return R;} # Arginine;
        if ( $_ 0 =~ / AGC|AGU|UCAGCU /i ) { return S;} # Serine;
        if ( $_ 0 =~ / ACAGCU /i ) { return T;} # Threonine;
        if ( $_ 0 =~ / GUAGCU /i ) { return V;} # Valine;
        if ( $_ 0 =~ / UGG /i ) { return W;} # Tryptophan;
        if ( $_ 0 =~ / UAC|UAU /i ) { return Y;} # Tyrosine;
        if ( $_ 0 =~ / UAA|UGA|UAG /i ) { return "***" ;} # Stop Codons;
    }
}
exit;

While reformatting, I noticed what look like curly bracket problems: the while loop's opening curly bracket isn't closed until after the subroutines. I suspect that's a bug, and I definitely agree with kennethk's suggestion about using strict and warnings.
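
For comparison, here is a minimal sketch of how the translation step might be laid out with strict and warnings on, the subroutine defined at file level rather than inside the while loop, and a hash lookup in place of the chain of regex tests. This is not the OP's code: `%codon_table`, the `X` placeholder for unrecognized codons, and the sample sequence are all assumptions for illustration (the OP reads the sequence from file1.txt), and the `//` operator assumes Perl 5.10 or later.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Standard genetic code, one letter per codon, with the three bases
# enumerated in U, C, A, G order (UUU, UUC, UUA, UUG, UCU, ...).
my @bases = qw(U C A G);
my $amino = 'FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG';

my %codon_table;
my $n = 0;
for my $b1 (@bases) {
    for my $b2 (@bases) {
        for my $b3 (@bases) {
            $codon_table{"$b1$b2$b3"} = substr($amino, $n++, 1);
        }
    }
}

# Translate an RNA string codon by codon, starting at offset $frame
# (0, 1 or 2). Unrecognized codons become 'X' (an assumption, not the
# OP's behaviour); a trailing partial codon is ignored.
sub translate {
    my ($rna, $frame) = @_;
    my $protein = '';
    for (my $i = $frame; $i + 3 <= length($rna); $i += 3) {
        my $codon = uc substr($rna, $i, 3);
        $protein .= $codon_table{$codon} // 'X';
    }
    return $protein;
}

# Hypothetical input sequence, just to exercise the three reading frames.
my $dna = 'ATGGCCTGTTAA';
(my $rna = $dna) =~ tr/T/U/;
print "frame $_: ", translate($rna, $_), "\n" for 0 .. 2;
```

Keeping the subs outside any loop also makes a missing closing bracket much easier to spot.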


Replies are listed 'Best First'.
Re^2: Urgent help required. Need a code to translate a given nucleotide sequence into proteins using codon table.
by kennethk (Abbot) on Feb 13, 2013 at 19:53 UTC
    For future reference, perlmonks linkifies anything between [ and ], so locations in the posted code that display as links should actually be array indices or character classes. For example
    if ( $_ 0 =~ / GGAGCU /i ) { return G;} # Glycine;
    is printed above as

    if ( $_ 0 =~ / GGAGCU /i ) { return G;} # Glycine;

    and should therefore actually probably be translated as

    if ( $_[0] =~ /GG[AGCU]/i ) { return G;} # Glycine;
    barring ambiguities with regard to whitespace counts. Not that I've done this before....
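
To show why the lost brackets matter, here is a small sketch comparing the pattern as displayed with the pattern as presumably intended. The variable names are made up for the demo. Note also that without the /x flag, spaces inside a regex are literal, so `/ GGAGCU /` would additionally require a space on each side.

```perl
use strict;
use warnings;

my $stripped = qr/GGAGCU/i;      # as displayed: a literal six-letter string
my $restored = qr/GG[AGCU]/i;    # as presumably intended: GG plus any one base

# Every glycine codon should match the restored pattern; none can match
# the stripped one, because a three-letter codon is shorter than six letters.
for my $codon (qw(GGA GGG GGC GGU)) {
    printf "%s  stripped: %-8s  restored: %s\n", $codon,
        ($codon =~ $stripped ? 'match' : 'no match'),
        ($codon =~ $restored ? 'match' : 'no match');
}
```

The same goes for `$_ 0`: that is a syntax error as displayed, whereas `$_[0]` is the first argument passed to the sub.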

    #11929 First ask yourself `How would I do this without a computer?' Then have the computer do it the same way.

Re^2: Urgent help required. Need a code to translate a given nucleotide sequence into proteins using codon table.
by suchetana (Initiate) on Feb 13, 2013 at 19:52 UTC
    I do have -w given, so it should catch the bugs and warn, right?

      warnings and strict cover different issues. Without strict, you get some very regressive behaviors that you almost assuredly don't intend, e.g. my point about barewords. Did you try addressing that issue yet?
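
A minimal sketch of the bareword point, using string eval so that both cases can run in one script. The variable names are made up for the demo; the `A` stands in for the `return A;` lines in the posted code.

```perl
use strict;
use warnings;

# With strict subs switched off, the bareword A is silently treated
# as the string "A".
my $loose = eval q{ no strict 'subs'; my $aa = A; $aa };

# The string eval below inherits strict from this file, so the same
# statement fails to compile and the eval returns undef with $@ set.
my $strict = eval q{ my $aa = A; $aa };
my $error  = $@;

print "without strict subs: ", (defined $loose ? $loose : 'undef'), "\n";
print "with strict subs:    ",
    (defined $strict ? $strict : "compile error: $error"), "\n";
```

Quoting the return values (`return 'A';`) makes the code behave the same way with or without strict.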


      #11929 First ask yourself `How would I do this without a computer?' Then have the computer do it the same way.