RabidMortal has asked for the wisdom of the Perl Monks concerning the following question:
I need help finding or writing a simple script to edit a very large FASTA (text) file.
The text format of the FASTA file is simple:
>HWI-EAS158_40_3_1_46_535
GTGAATGCGTGATACAGGAATGTTCGTTGTGACCAT
>HWI-EAS158_40_3_1_47_579
AAAGTGAATGCGTGATACAGGAATGTTCGTTGTGAC
>HWI-EAS158_40_3_1_46_731
GTGTCATGCGTGATACAGGAATGTTCGTTGTGAAAA
Each file has 6,000,000 lines, all in exactly the same format. I need a script that goes through the file and trims x nucleotides from the beginning, and y nucleotides from the end, of every sequence. The script should leave the lines beginning with ">" untouched.
I would really appreciate any help.
Thank you.
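
The trimming described above could be sketched roughly as follows. This is only a minimal illustration, written in Python rather than Perl, and it assumes that every sequence sits on a single line, as in the sample. The function names trim_lines and trim_fasta_file are made up for this example. The file is read one line at a time, so the 6,000,000 lines never need to fit in memory at once.

```python
def trim_lines(lines, x, y):
    """Yield header lines unchanged and sequence lines with x leading
    and y trailing characters removed (hypothetical helper)."""
    for line in lines:
        text = line.rstrip("\r\n")
        if text.startswith(">"):
            # Header line: pass through untouched.
            yield text
        else:
            # Use an explicit end index so that y == 0 keeps the whole tail;
            # max() yields an empty string if x + y exceeds the length.
            yield text[x:max(x, len(text) - y)]


def trim_fasta_file(in_path, out_path, x, y):
    """Stream in_path to out_path, trimming each sequence line.
    Assumes one sequence per line, as in the sample above."""
    with open(in_path) as src, open(out_path, "w") as dst:
        for out in trim_lines(src, x, y):
            dst.write(out + "\n")
```

Calling trim_fasta_file("reads.fa", "reads_trimmed.fa", 3, 2) would, under these assumptions, write a new file in which each sequence has lost its first 3 and last 2 nucleotides; both file names are placeholders. Writing to a separate output file keeps the original intact in case x or y turns out to be wrong.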