
Extracting IDs from fasta file

by MBobur (Initiate)
on Mar 01, 2013 at 09:53 UTC ( #1021209=perlquestion )
MBobur has asked for the wisdom of the Perl Monks concerning the following question:

Hello. I need to extract gene IDs from a FASTA-format file that contains multiple genes with their sequences. It looks like this:

>BTBSCRYR|IV|123123-43245273
tgcaccaaacatgtctaaagctggaaccaaaattactttctttgaagacaaaaaggccgccactatgacagcgattgcgactgtgcagatttccacatgtacctgagccgctg
>BTBSCRYADASR|IV|123123-43245273
gagctcccgggggagggaggacggccgggccggggcgctaagacccggggcgcggtggtagaggttcccagcaggacactagagggcgatccccggccctgctgcgggggtgtatata
gagctc

My final text file must contain just the gene IDs.
Can someone make me a script that can do that? Thank you in advance.

Replies are listed 'Best First'.
Re: Extracting IDs from fasta file
by marto (Bishop) on Mar 01, 2013 at 09:58 UTC

    This isn't a code writing service. Consider actually reading and understanding the links given here.

Re: Extracting IDs from fasta file
by tmharish (Friar) on Mar 01, 2013 at 10:01 UTC
    Can someone make me a script that can do that?

    Short answer: No

    Long answers: Here, Here and Here

Re: Extracting IDs from fasta file
by 2teez (Vicar) on Mar 01, 2013 at 12:36 UTC

    Hi MBobur,
    If I may add my voice to the wonderful wisdom you have already received:
    Which would you prefer?

    • To follow both
      marto's wisdom in Re: intron length, a reply to one of your posts,
      and that of
      blue_cowdawg which says:
      fellow monk, there are better ways to ask homework questions. First step in my opinion is to try and solve the problem yourself. If it works, you've accomplished something truly great. If it crashes, burns and dies in the flames then you'll accomplish even greater things by posting a well titled and formatted question to Seekers of Perl Wisdom so that others in a similar predicament as yours can learn by it. At least, in my honest opinion, is how Perl Monks is supposed to work.
    • OR, I write this for you:
      use warnings;
      use strict;

      while (<DATA>) {
          # print the text between '>' and the first '|' of each title line
          print $1, $/ if m[^>(?=(.+?)\|)];
      }
      __DATA__
      >BTBSCRYR|IV|123123-43245273
      tgcaccaaacatgtctaaagctggaaccaaaattactttctttgaagacaaaaaggccgccactatgacagcgattgcgactgtgcagatttccacatgtacctgagccgctg
      >BTBSCRYADASR|IV|123123-43245273
      And then you have to keep asking what each part stands for and how to do the next assignment or project?
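
For readers puzzling over the pattern in the script above: the lookahead `(?=(.+?)\|)` captures the shortest run of characters that follows the leading `>` and is itself followed by a `|`. A plainer way to express the same idea is sketched below. It assumes the ID is everything between `>` and the first `|`, as in the sample headers, and the sub name `fasta_id` is a hypothetical helper invented for this illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: return the ID from a FASTA title line,
# or undef if the line is not a title line containing a '|'.
sub fasta_id {
    my ($line) = @_;
    # '>' at the start of the line, then capture everything up to the first '|'
    return $line =~ /^>([^|]+)\|/ ? $1 : undef;
}

for my $line ('>BTBSCRYR|IV|123123-43245273', 'tgcaccaaacatg') {
    my $id = fasta_id($line);
    print defined $id ? "$id\n" : "(not a title line)\n";
}
```

The negated character class `[^|]+` stops at the first `|` without needing a lookahead or a non-greedy quantifier, which tends to be easier to read. It should behave the same as the original pattern for headers shaped like the sample, though the two can differ on unusual input such as an ID that is empty.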

    I think you will grow and become a better Perl user, even helping others, if you choose to follow the wisdom of those who have walked your path before you.
    However, if none of this makes sense, you can check this: How do I post a question effectively?

    If you tell me, I'll forget.
    If you show me, I'll remember.
    If you involve me, I'll understand.
    --- Author unknown to me
Re: Extracting IDs from fasta file
by Sinistral (Prior) on Mar 01, 2013 at 16:49 UTC

    Help With FASTA

    In addition to the many useful responses you've received, one source of wisdom for Bioinformatics + Perl is the BioPerl Site, which you may not previously have been exposed to. I highly recommend you begin your search for a solution there. A quick search revealed a BioPerl page devoted just to FASTA, which will in turn lead you to the many Bio-Perl-Monks who have already developed solutions for just your problem.

Re: Extracting IDs from fasta file
by MBobur (Initiate) on Mar 01, 2013 at 10:00 UTC

    Each gene in the file starts with a title line, followed by its sequence on a new line. Each ID in the final file should be written on its own line. Thank you.
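
The layout described above (a title line, then the sequence on following lines, with one ID per output line) could be handled roughly as follows. This is only a sketch under stated assumptions: the ID is taken to be the text between `>` and the first `|` (or whitespace), the sub name `write_ids` is hypothetical, and the sequences are shortened for the demo. In-memory filehandles stand in for the real input and output files.

```perl
use strict;
use warnings;

# Hypothetical helper: copy the ID of every FASTA title line from $in to $out,
# one ID per line, and return how many IDs were written.
# Assumption: the ID is the text between '>' and the first '|' or whitespace.
sub write_ids {
    my ($in, $out) = @_;
    my $count = 0;
    while (my $line = <$in>) {
        next unless $line =~ /^>([^|\s]+)/;   # skip sequence lines
        print {$out} "$1\n";
        $count++;
    }
    return $count;
}

# Demo data; sequences are shortened for the example.
my $fasta = <<'END';
>BTBSCRYR|IV|123123-43245273
tgcaccaaacatgtctaaagctggaacc
>BTBSCRYADASR|IV|123123-43245273
gagctcccgggggagggaggacggccgg
END

open my $in,  '<', \$fasta  or die "Cannot read sample data: $!";
open my $out, '>', \my $ids or die "Cannot open output buffer: $!";
my $n = write_ids($in, $out);
close $out;

print $ids;                  # the extracted IDs, one per line
print "IDs written: $n\n";
```

To run this on real data, the two `open` calls would take file names instead of scalar references, for example a hypothetical `genes.fasta` for input and `ids.txt` for output.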
