Beefy Boxes and Bandwidth Generously Provided by pair Networks
Do you know where your variables are?
 
PerlMonks  

Re: please help me!!

by robsv (Curate)
on Apr 11, 2002 at 18:15 UTC ( #158379=note: print w/ replies, xml ) Need Help??


in reply to please help me parse genbank DNA file

Have you looked into Bioperl? It will simplify parsing for you (especially for the sequence itself). Here's a program that gets the sequence and some other basic information:

#!/usr/local/bin/perl -w use strict; use Bio::SeqIO; my $seqobj; print "please type in the name of a file\n"; my $file = <STDIN>; my $seqio = Bio::SeqIO->new (-format => 'GenBank', -file => $file); while ($seqobj = $seqio->next_seq()) { printf "Sequence: %s\n",$seqobj->seq; # I'm not sure what you need other than the # sequence - here's some examples: printf "Display ID: %s\n",$seqobj->display_id; printf "Description: %s\n",$seqobj->desc; printf "Division: %s\n",$seqobj->division; printf "Accession: %s\n",$seqobj->accession; }
In your program, you're putting all of the non-sequence lines into @annotation. I'm not sure specifically which information you need (i.e. descriprtion, accession number, etc.), but those are all accessible through the "$seqobj" object. There's some examples in the code above; you'll find many more in the documentation.

This method also has the advantage of being able to handle multiple GenBank records per file.

This is just a tiny portion of the functions available with BioPerl - it will also parse BLAST files, perform alignments, etc. If you're interested, you can grab the latest release from CPAN or from BioPerl here. Hope this helps!

- robsv


Comment on Re: please help me!!
Download Code

Log In?
Username:
Password:

What's my password?
Create A New User
Node Status?
node history
Node Type: note [id://158379]
help
Chatterbox?
and the web crawler heard nothing...

How do I use this? | Other CB clients
Other Users?
Others meditating upon the Monastery: (16)
As of 2015-07-28 15:04 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    The top three priorities of my open tasks are (in descending order of likelihood to be worked on) ...









    Results (257 votes), past polls