perlquestion
charm
<p>Perlmonks, this is the question I am having trouble answering:</p>
<p>
Use a random DNA sequence generator as many times as you need to get protein-coding region(s) (nucleotide triplets). The minimum sequence length is 500 bp.
</p>
<p>• Find all possible protein-coding regions in microbes (between start and stop codons, ATG – TAG|TGA|TAA). </p>
<p>
• Using the standard genetic code table (Wikipedia or any other source), create a hash table for all amino acids (<c>$amino{TTT} => "F", ...</c>). </p>
<p>
• Write the protein sequences found to a file in the following format:<br />
Position of 1st start codon: protein sequence [length of protein sequence]<br />
Position of 2nd start codon: ….
</p>
<p>
For example: <c>45: FLPQWCV [7]</c>
</p>
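<p>
For reference, here is a minimal sketch of how the codon hash and the output line could look. The names <c>translate</c> and <c>format_record</c> are made up for illustration, and the sketch assumes the reported protein starts at the first codon of the string it is given and stops at the first stop codon; the assignment does not say whether the M of the start codon should be included.
</p>

```perl
use strict;
use warnings;

# Standard genetic code in the conventional T, C, A, G ordering
# (first base is the slowest-changing, third base the fastest).
my @bases = qw(T C A G);
my $aas   = 'FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG';

my %amino;
my $i = 0;
for my $b1 (@bases) {
    for my $b2 (@bases) {
        for my $b3 (@bases) {
            $amino{"$b1$b2$b3"} = substr($aas, $i++, 1);   # '*' marks a stop codon
        }
    }
}

# Hypothetical helper: translate codon by codon until a stop codon,
# an unknown triplet, or the end of the string.
sub translate {
    my ($dna) = @_;
    my $protein = '';
    for (my $p = 0; $p + 3 <= length $dna; $p += 3) {
        my $aa = $amino{ substr($dna, $p, 3) };
        last if !defined $aa || $aa eq '*';
        $protein .= $aa;
    }
    return $protein;
}

# Hypothetical helper: "position: protein [length]", as in the example line.
sub format_record {
    my ($pos, $protein) = @_;
    return sprintf "%d: %s [%d]", $pos, $protein, length $protein;
}

print format_record(45, translate('TTTCTTCCTCAATGGTGTGTTTAA')), "\n";
```

<p>
Building the hash from the 64-letter string avoids typing 64 key/value pairs by hand, but a hand-written <c>%amino = (TTT => "F", ...)</c> list works just as well.
</p>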
<p>
I am sort of clueless and not sure where to start. To find a coding sequence, I generated a random 5000 bp nucleotide sequence, but every time I use the code below to look for a coding region it finds nothing. Can anyone tell me what I am doing wrong? </p>
<code>
@nucs=("A","C","G","T");
$size=5000;
for ($i=0; $i<$size; $i++) {
$seqR .= $nucs[int(rand(4))];
}
print "Seq($size): $seqR\n";
if (/ATG([ACGT][ACGT][ACGT]){3,5000}(TAA|TAG|TGA)/) {
print "This seq. might contain a coding region\n"
} else{
print "This sequence most likely does not contain a coding region\n"
}
</code>
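<p>
One likely cause: a bare <c>/.../</c> match with no <c>=~</c> tests <c>$_</c>, which is never set in the code above, so the pattern is never tried against <c>$seqR</c>. The sketch below binds the match to the sequence and collects every candidate region. The sub name <c>find_orfs</c> is made up for illustration, and the sketch assumes that "coding region" means an ATG followed by the first stop codon in the same reading frame, with no minimum length.
</p>

```perl
use strict;
use warnings;

# Hypothetical helper: return [1-based start position, region] for every ATG
# that is followed, in the same reading frame, by a stop codon.
sub find_orfs {
    my ($seq) = @_;
    my @orfs;
    # The lookahead makes each match zero-width, so overlapping candidates are
    # all reported; the lazy quantifier stops at the first in-frame stop codon.
    while ($seq =~ /(?=(ATG(?:[ACGT]{3})*?(?:TAA|TAG|TGA)))/g) {
        push @orfs, [ $-[1] + 1, $1 ];
    }
    return @orfs;
}

# Same random sequence as in the question, built with join/map.
my @nucs = qw(A C G T);
my $seqR = join '', map { $nucs[ int rand 4 ] } 1 .. 5000;

my @found = find_orfs($seqR);
print scalar(@found), " candidate coding regions found\n";
```

<p>
For the simple yes/no test in the original code, writing <c>if ($seqR =~ /ATG.../)</c> should be enough; the loop is only needed to get the position of each start codon.
</p>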