
Re: Bioperl or ncbi: parsing refseq files

by BioLion (Curate)
on Jun 14, 2010 at 15:45 UTC ( #844644=note )

in reply to Bioperl or ncbi: parsing refseq files

Hi, your GenBank flat files can be handled with the BioPerl suite. See this HowTo for a very detailed guide to IO, which covers the GenBank parser (and a lot of others).

You'll be using the Bio::SeqIO module (see Yes, even you can use CPAN for a guide on getting modules installed, or ignore that if I am patronising you...) to read in the files, test whether each sequence feature lies in your region of interest, and, if it does, write it out to a fresh (smaller) file. You can do all this on the fly, one record at a time, so your large file shouldn't cause memory problems...
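
To make that loop concrete, here is a minimal sketch, not a definitive implementation. It assumes BioPerl is installed; the helper names overlaps_region and filter_genbank, the file names, and the region coordinates are made up for illustration. It keeps any feature that overlaps the region (not only those fully inside it) and leaves the sequence itself untouched.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: true if feature [fstart, fend] shares at least one
# base with region [rstart, rend]. Coordinates are 1-based and inclusive,
# as in GenBank records.
sub overlaps_region {
    my ($fstart, $fend, $rstart, $rend) = @_;
    return ($fstart <= $rend && $fend >= $rstart) ? 1 : 0;
}

# Hypothetical helper: stream records from $in_file, keep only the features
# that overlap the region, and write each filtered record to $out_file.
sub filter_genbank {
    my ($in_file, $out_file, $rstart, $rend) = @_;
    require Bio::SeqIO;    # loaded here so the helper above runs without BioPerl

    my $in  = Bio::SeqIO->new(-file => $in_file,     -format => 'genbank');
    my $out = Bio::SeqIO->new(-file => ">$out_file", -format => 'genbank');

    # next_seq hands back one record at a time, so memory use stays low
    while (my $seq = $in->next_seq) {
        # Detach every feature, then re-attach only the ones we want
        my @all = $seq->remove_SeqFeatures;
        for my $feat (@all) {
            $seq->add_SeqFeature($feat)
                if overlaps_region($feat->start, $feat->end, $rstart, $rend);
        }
        $out->write_seq($seq);
    }
    return;
}

# Assumed command line: perl filter_region.pl in.gbk out.gbk 1000 5000
filter_genbank(@ARGV) if @ARGV == 4;
```

Note that this writes the full sequence and leaves feature coordinates as they were. If you need the record cut down to the region, the Bio::SeqUtils documentation describes a trunc_with_features method that looks like a fit, but check how it treats partially overlapping features before relying on it.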

Have a read, have a go, and get back to the Monastery (with examples and code) if you are having problems. Hope this helps!

Update: Typos...

Just a something something...

Replies are listed 'Best First'.
Re^2: Bioperl or ncbi: parsing refseq files
by roibrodo (Sexton) on Jun 15, 2010 at 09:30 UTC

    Thanks for the reply.

    I'm not sure I got it right. Even if a feature is only partially inside the region of interest, I want to "truncate" it and keep it. I also want the sequence (which appears after all the features) to be output. Basically, I want to do exactly what "change region shown" does in the online version of NCBI.

    I would appreciate a more verbose example, if possible, since these are my first steps with BioPerl.

