http://www.perlmonks.org?node_id=833644

RabidMortal has asked for the wisdom of the Perl Monks concerning the following question:

Hello,

I need help finding or writing a simple script to edit a very large FASTA (text) file.

The text format of the FASTA file is simple:

>HWI-EAS158_40_3_1_46_535
GTGAATGCGTGATACAGGAATGTTCGTTGTGACCAT
>HWI-EAS158_40_3_1_47_579
AAAGTGAATGCGTGATACAGGAATGTTCGTTGTGAC
>HWI-EAS158_40_3_1_46_731
GTGTCATGCGTGATACAGGAATGTTCGTTGTGAAAA

Each file has 6,000,000 lines, all in exactly the same format. I need a script that goes through the file and trims x nucleotides from the beginning, and y nucleotides from the end, of every sequence. The script should not touch the header lines, i.e. the lines beginning with ">".
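
To make the requirement concrete, here is a rough sketch of the behavior described above. It is only an illustration under some assumptions: the names trim_seq and trim_fasta are made up, every sequence is assumed to sit on a single line (as in the sample), x and y are assumed to be non-negative integers given on the command line, and a sequence shorter than x + y is assumed to become an empty line. It reads one line at a time, so the size of the file should not matter for memory.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: drop $x characters from the start and $y from the
# end of one sequence. Returns '' if nothing would be left (an assumption;
# the post does not say what should happen to too-short sequences).
sub trim_seq {
    my ($seq, $x, $y) = @_;
    my $keep = length($seq) - $x - $y;
    return $keep > 0 ? substr($seq, $x, $keep) : '';
}

# Hypothetical helper: copy FASTA lines from $in to $out, trimming only
# the sequence lines. Header lines (starting with ">") and blank lines
# are written back unchanged.
sub trim_fasta {
    my ($in, $out, $x, $y) = @_;
    while (my $line = <$in>) {
        $line =~ s/\r?\n\z//;    # strip an LF or CRLF line ending
        if ($line eq '' || $line =~ /^>/) {
            print {$out} "$line\n";
        }
        else {
            print {$out} trim_seq($line, $x, $y), "\n";
        }
    }
    return;
}

# Only act when exactly two numeric arguments (x and y) are supplied.
if (@ARGV == 2 && !grep { !/^\d+\z/ } @ARGV) {
    my ($x, $y) = @ARGV;
    trim_fasta(\*STDIN, \*STDOUT, $x, $y);
}
```

Saved under a name such as trim_fasta.pl (also hypothetical), it would be run as `perl trim_fasta.pl 3 4 < in.fa > out.fa` to remove 3 nucleotides from the start and 4 from the end of each sequence.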

I would really appreciate any help.

Thank you.