Have you looked into BioPerl? It will simplify parsing for you (especially for the sequence itself). The program below gets the sequence and some other basic information:
In your program, you're putting all of the non-sequence lines into @annotation. I'm not sure specifically which information you need (e.g. description, accession number, etc.), but all of it is accessible through the "$seqobj" object. There are some examples in the code below; you'll find many more in the documentation.
use strict;
use warnings;
use Bio::SeqIO;

print "please type in the name of a file\n";
my $file = <STDIN>;
chomp $file;    # strip the trailing newline, otherwise the file won't be found

my $seqio = Bio::SeqIO->new(-format => 'GenBank',
                            -file   => $file);

# next_seq() returns one record at a time, and undef when the file is exhausted
while (my $seqobj = $seqio->next_seq()) {
    printf "Sequence: %s\n", $seqobj->seq;
    # I'm not sure what you need other than the
    # sequence - here are some examples:
    printf "Display ID: %s\n", $seqobj->display_id;
    printf "Description: %s\n", $seqobj->desc;
    printf "Division: %s\n", $seqobj->division;
    printf "Accession: %s\n", $seqobj->accession;
}
This method also has the advantage of being able to handle multiple GenBank records per file.
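If the non-sequence lines you were collecting in @annotation include the FEATURES table, those entries are also available as objects. Here is a rough sketch of how that might look; taking the file name from the command line and picking the "gene" qualifier are just my assumptions for illustration, so adjust them to whatever you actually need:

```perl
use strict;
use warnings;
use Bio::SeqIO;

# Assumption: the GenBank file path is given as the first command-line argument.
my $file = shift @ARGV or die "usage: $0 file.gb\n";

my $seqio = Bio::SeqIO->new(-format => 'GenBank',
                            -file   => $file);

while (my $seqobj = $seqio->next_seq()) {
    printf "Record: %s\n", $seqobj->display_id;

    # Each entry in the FEATURES table is returned as a feature object.
    for my $feat ($seqobj->get_SeqFeatures) {
        printf "  %-10s %d..%d\n",
            $feat->primary_tag, $feat->start, $feat->end;

        # Qualifiers such as /gene="..." are stored as tags on the feature.
        # "gene" is only an example; not every feature carries it.
        if ($feat->has_tag('gene')) {
            printf "    gene: %s\n",
                join(', ', $feat->get_tag_values('gene'));
        }
    }
}
```

Checking has_tag before get_tag_values matters, since asking for a tag the feature doesn't have throws an exception.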
This is just a tiny portion of what BioPerl can do: it will also parse BLAST files, perform alignments, etc. If you're interested, you can grab the latest release from CPAN or from the BioPerl site. Hope this helps!