Re: Bioperl or ncbi: parsing refseq files

by BioLion (Curate)
on Jun 14, 2010 at 15:45 UTC


in reply to Bioperl or ncbi: parsing refseq files

Hi, your flat GenBank files can be handled with the BioPerl suite. See this HowTo for a very detailed guide to sequence I/O, which covers the GenBank parser (and a lot of others).

You'll be using the Bio::SeqIO module (see Yes, even you can use CPAN for a guide on getting modules installed, or ignore this if I am patronising you...) to read in the files, test whether each sequence feature lies in your region of interest, and, if it does, write it out to a fresh (smaller) file. You can do all this on the fly, one record at a time, so even a large file shouldn't cause memory problems...
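To make that a bit more concrete, here is a minimal, untested sketch of the read / test / write loop. The sub names (overlaps, extract_region) and the command-line argument order are my own inventions for illustration, not anything BioPerl provides, and I am assuming "in your region" means "shares at least one base with it". Note that it writes each record with its full sequence and only prunes the feature table; it does not clip anything to the region.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: true if feature [fs, fe] shares at least one
# base with the region of interest [rs, re] (all coordinates 1-based).
sub overlaps {
    my ($fs, $fe, $rs, $re) = @_;
    return $fs <= $re && $fe >= $rs;
}

# Hypothetical helper: stream a GenBank file one record at a time and
# write out only the records that have features touching the region.
sub extract_region {
    my ($in_file, $out_file, $rs, $re) = @_;

    require Bio::SeqIO;    # loaded here so the helper above works without BioPerl

    my $in  = Bio::SeqIO->new(-file => $in_file,      -format => 'genbank');
    my $out = Bio::SeqIO->new(-file => ">$out_file", -format => 'genbank');

    while (my $seq = $in->next_seq) {
        # Keep top-level features whose overall start/end overlap the region.
        my @keep = grep { overlaps($_->start, $_->end, $rs, $re) }
                   $seq->get_SeqFeatures;
        next unless @keep;

        $seq->remove_SeqFeatures;       # drop every feature on the record...
        $seq->add_SeqFeature(@keep);    # ...then re-attach the ones we want
        $out->write_seq($seq);
    }
    return;
}

# Usage: perl extract.pl in.gb out.gb START END
extract_region(@ARGV) if @ARGV == 4;
```

Only one record is held in memory at a time, which is the whole point of going through next_seq rather than slurping the file.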

Have a read, have a go, and get back to the Monastery (with examples and code) if you are having problems. Hope this helps!

Update: Typos...

Just a something something...


Re^2: Bioperl or ncbi: parsing refseq files
by roibrodo (Sexton) on Jun 15, 2010 at 09:30 UTC

    Thanks for the reply.

    I'm not sure I got it right. Even if a feature is only partially inside the region of interest, I want to "truncate" it and keep it. I also want the sequence (which appears after all the features) to be output. Basically, I want to do exactly what "change region shown" does on the online version of NCBI.

    I would appreciate a more verbose example, if possible, since these are my first steps with BioPerl.

    Thanks!
