I am trying to get a grasp of regular expressions and their various uses. I am working on a problem where I am given a string of letters and have to use a regular expression to find the sequences that start with ATG and end with TAG, TAA, or TGA. I am having trouble working out a single regular expression that searches for all three of these endings.
Here is what I have so far:
#!/usr/bin/perl
use strict;
use warnings;
use diagnostics;
# input sequence
my $seq = 'AATGGTTTCTCCCATCTCTCCATCGGCATAAAAATACAGAATGATCTAACGAA';
# find codons
while ($seq =~ m/ATG(.*)(TAG|TAA|TGA)/g) {
    # print codons
    print $1, "\n";
}
However, I am not getting the correct output. Instead, I am told that $1 is unspecified.
Any suggestions? I would really like to understand how this sort of regular expression works.
Thank you!
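For reference, here is a minimal sketch of one likely fix, assuming the goal is the shortest stretch between each ATG and the next stop codon. The alternation `(TAG|TAA|TGA)` already covers all three endings in a single expression. The part that probably needs changing is `.*`, which is greedy and runs to the last stop codon in the string, whereas `.*?` stops at the first one. The helper name `find_regions` is hypothetical and only used for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not from the original post): returns one
# [body, stop codon] pair for every ATG ... stop region found.
sub find_regions {
    my ($seq) = @_;
    my @found;
    # .*? is non-greedy, so each match ends at the first stop codon
    # that follows the ATG; /g resumes the search after each match.
    while ($seq =~ m/ATG(.*?)(TAG|TAA|TGA)/g) {
        push @found, [$1, $2];
    }
    return @found;
}

my $seq = 'AATGGTTTCTCCCATCTCTCCATCGGCATAAAAATACAGAATGATCTAACGAA';
for my $hit (find_regions($seq)) {
    print "$hit->[0] (stop: $hit->[1])\n";
}
```

With the sequence above this should report two regions, `GTTTCTCCCATCTCTCCATCGGCA` and `ATC`, both ending in TAA. If the stop codon also has to be in the same reading frame as the ATG, replacing `(.*?)` with `((?:[ACGT]{3})*?)` is one way to consume the body three letters at a time.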