Hello all,
I am trying to parse a FASTA file using Perl. The following is the input file:
>CVSF43565.d1 bg|346278
CAGACACACTTCTTTTAGTTGAGACACATGGAAAACATCATGTATGGCAGACAACTGTTCTGGGAGTTGG
ATCCGGGTAAGCAACGGGTCCACATATCTCCACAATCTCATAAGGGGCCAACATAGCGGGGGAGCTAACT
TGCCTTTGATTCCAAACCGTTGCACTCCTTTGGTCGGGGAAACTCGAAGGTACACATGATCACCAAGGTC
GAACTGCAGGGGTCTTCTCCGCTGGTCGGAGTAGCTCTTCTGTCAAGATTGGGCGGCCTTGAGATGTGCT
TGAATTACCTTCACTTGCTCTTCGGCTTCTGCCACTTAAGTCAGGGCCATAGACCTGTCTCTCCCCTGGG
CAGACACACTTCTTTTAGTTGAGACACATGGAAAACATCATGTATGGCAGACAACTGTTCTGGGAGTTGG
ATCCGGGTAAGC
>CVSF43566.d1 bg|346279
CAGACACACTTCTTTTAGTTGAGACACATGGAAAACATCATGTATGGCAGACAACTGTTCTGGGAGTTGG
ATCCGGGTAAGCCAGACACACTTCTTTTAGTTGAGACACATGGAAAACATCATGTATGGCAGACAACTGT
TCTGGGAGTTGGAATGCTAGTCGATCGCCAGACACACTTCTTTTAGTTGAGACACATGGAAAACATCATG
TTGGCAGACAACTGTTCTGGGAGTTGGATCCGGGTAAGCCAGACACACTTCTTTTAGTTGAGACACATGG
AAAACATCATGTATGGCAGACAACTGTTCTGGGAGTTGGATCCGGGTAAGC
>CVSF43567.d1 bg|346280
CGTAGCTGATGCTGTGCTGTTGTGTCGGGGGGATATATATATATATATGGGGTCGTAGTCGTAGCGCTAG
TATGCTAGCAGCGTAGATGCTGATCGATGCTGATGCTGATCGTAGTCGTAGGCTAGTGCGATCGTAGTCG
TAGTCGATGCTGATGCGTAGCTGATGTGCTGCTGATGCTAGTCGTCGTAGCTGATGCATGCTGATCGTAG
TGCTCGATGCTAGTCGTAGTCGTAGTCGTAGCGACTGATGCGATCGTAGTCGGATGCTAGCACGTAGCTG
GCTCGATGCTGATGCTGAT
>CVSF10000.x1 bg|356789 pair:789860
ATGCGTAGCTGATGTGCTGCTGATGCTAGTCGTCGTAGCTGATGCATGCTGATCGTAGTGCTCGATGCTA
GTCGTAGTCGTAGTCGTAGCGACTGATGCGATCGTAGTCGGATGATGCTGACTGATGCTGATCTGTACGT
CGTAGCTGATGCATGCGCTAGTAGCT
>CVSF10000.y1 bg|356790 pair:789859
GCTAGTCGATGCTGATGCTGTAGCTAGCGTAGTCGTACGCGCGCGCGCGCGTTTTTTGTGACGTCGTAGT
CCGTAGCTGATGCGATGCTAGTGCTGTGTCAGCTGATGTCGTGTGTAGCTGATGCTGATCGTTCGTGTGT
CGATGCTGATGCTAGTCGTAGTGTAT
>CVSF10001.x1 bg|356791 pair:789862
AGTCGTAGTCGTAGCTGTAGCTGATGCTGTGTACGATGCTGATGCGATGCGTAGCGTAGCATCGATGCTA
CGACTAGTCGTAGTCGTC
>CVSF10001.y1 bg|356792 pair:789861
CGTAGCTGATGCTGATCGTAGTCGTAGTCGATGCGATGCTAGTCGTAGCTGTAGCTGATGCTGCGTGCTG
CAGTCGATGCTAGTCGATGCTGATCGTCTAGCAT
I want to write the records whose header has a "pair" field (the header line and the sequence data that follows it) to one file, and the records without a "pair" field to another.
However, with the following code I am only able to write the header lines. I also want the sequence data following each header line (ATGCTAGCTG....) to be included in the output files.
Any suggestions?
#!/usr/bin/perl
my $in = $ARGV[0];
my $p = $ARGV[1];
my $s = $ARGV[2];
open IN, "<$in" or die $!;
open P_OUT, ">$p" or die $!;
open S_OUT, ">$s" or die $!;
while(<IN>){
    chomp;
    if(/^>/){
        my @header = split / /;
        if($header[2] ne ''){
            print P_OUT "$header[0]"." "."$header[1]"." "."$header[2]\n";
        }
        else{
            print S_OUT "$header[0]"." "."$header[1]\n";
        }
    }
    #unless(/^>/){
    #    print OUT "$_\n";
    #    next;
    #}
}
close(IN);
close(P_OUT);
close(S_OUT);
Thanks!!!
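
One way to get the sequence lines as well is to remember which output file the most recent header went to, and to send every following line there until the next header appears. Below is a minimal sketch of that idea, not a definitive solution. It assumes a record counts as paired exactly when its header contains a "pair:" field, as in the sample input. The sub name split_fasta and the script name split_fasta.pl are made up for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Copy FASTA records from $in_fh to one of two output handles.
# Records whose header has a pair:... field go to $paired_fh,
# all other records go to $single_fh.
# (split_fasta is a hypothetical name chosen for this sketch.)
sub split_fasta {
    my ($in_fh, $paired_fh, $single_fh) = @_;
    my $out_fh;    # destination of the record currently being read

    while (my $line = <$in_fh>) {
        if ($line =~ /^>/) {
            # New record: pick the destination once, from the header.
            # Assumption: "paired" means the header contains " pair:<id>".
            $out_fh = $line =~ /\spair:\S+/ ? $paired_fh : $single_fh;
        }
        # Header and sequence lines both go to the current destination.
        # Lines before the first header have no destination and are skipped.
        print {$out_fh} $line if defined $out_fh;
    }
    return;
}

# Assumed invocation: perl split_fasta.pl input.fasta paired.fasta single.fasta
# The block runs only when exactly three file names are given.
if (@ARGV == 3) {
    my ($in, $p, $s) = @ARGV;
    open my $in_fh, '<', $in or die "Cannot open $in: $!";
    open my $p_fh,  '>', $p  or die "Cannot open $p: $!";
    open my $s_fh,  '>', $s  or die "Cannot open $s: $!";
    split_fasta($in_fh, $p_fh, $s_fh);
    close $in_fh;
    close $p_fh or die "Cannot close $p: $!";
    close $s_fh or die "Cannot close $s: $!";
}
```

The lines are not chomped, so each one is written back with its original newline. With the sample input above, the four CVSF1000x records should end up in the paired file and the three .d1 records in the other one.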