Removing nucleotide frm sequence
by bingalee (Acolyte)
on Jun 06, 2013 at 14:38 UTC
bingalee has asked for the
wisdom of the Perl Monks concerning the following question:
Hi, I'm a beginner at Perl programming. I need to write a script that removes nucleotides from many sequences. My data looks something like this:
This is one set; there are many sets like this in the file. So if I want to remove the last 5 "A"s from the sequence and the corresponding quality (>CC@B), and do this for all the sequences, how do I go about it? First I thought I should split the data into arrays using the '+', but then I would have to remove the last five characters of each element of the array, join the elements, and re-split them differently so that I can then remove the last 5 "quality" characters from each element. I'm sure there's a less complicated procedure. Can anyone help me out here, please?
Sorry if I framed my question wrong.
So I need to remove the last 5 nucleotides from each sequence, irrespective of whether they are "A" or not; sorry if I said otherwise.
Also, I need to remove the corresponding quality values of those nucleotides, which are the symbol-like characters. For example, in the first sequence, if I'm removing "AAAAA" I also need to remove ">CC@B".
Is it doable? :(
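One way this could be done, sketched under assumptions since the sample data is not shown above: the code below assumes each set is four lines (a header line, the sequence, a '+' line, and the quality string), and that the sequence and quality are the same length. The record in $sample, the names trim_tail and trim_records, and the choice to leave too-short reads unchanged are all made up for illustration. Rather than splitting and re-joining arrays, it uses substr with a negative length to drop the last 5 characters from both strings.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# ASSUMPTION: each record is four lines (header, sequence, '+' line, quality).
# The sample record further down is invented for illustration only.
my $TRIM = 5;    # number of trailing positions to drop

# Remove the last $n characters from both strings so they stay aligned.
sub trim_tail {
    my ($seq, $qual, $n) = @_;
    die "sequence and quality differ in length\n"
        unless length($seq) == length($qual);
    # Design choice (an assumption): reads of $n characters or fewer are
    # returned unchanged instead of being emptied.
    return ($seq, $qual) if $n <= 0 || length($seq) <= $n;
    return (substr($seq, 0, -$n), substr($qual, 0, -$n));
}

# Read records four lines at a time and print the trimmed version.
sub trim_records {
    my ($in, $out, $n) = @_;
    while (defined(my $header = <$in>)) {
        my $seq  = <$in>;
        my $plus = <$in>;
        my $qual = <$in>;
        last unless defined $qual;    # stop at an incomplete trailing record
        chomp($header, $seq, $plus, $qual);
        ($seq, $qual) = trim_tail($seq, $qual, $n);
        print {$out} "$header\n$seq\n$plus\n$qual\n";
    }
}

# Hypothetical record, read through an in-memory filehandle.
my $sample = "\@read1\nGATTACAGGTAAAAA\n+\nIIIIIIIIII>CC\@B\n";
open my $in, '<', \$sample or die "cannot open sample: $!";
trim_records($in, \*STDOUT, $TRIM);
close $in;
```

To run this over a real file, the in-memory handle could be replaced with one opened on the input file (and a second handle opened for writing the output), passing both to trim_records. If the records in the file are not laid out four lines at a time, the reading loop would need to be adapted.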