http://www.perlmonks.org?node_id=974073


in reply to alphabet counting

Here is a solution for counting the letters, adapted from the BioPerl webpage: follow the 'HOWTO:Beginners' link on the opening page, then link 17 (Obtaining basic sequence statistics). That led me to the docs for Bio::Tools::SeqStats.

#!/usr/bin/perl
use strict;
use warnings;
use Bio::SeqIO;
use Bio::Tools::SeqStats;

# The 20 standard amino acids, in the column order used for the output.
my @prot = qw/ A C D E F G H I K L M N P Q R S T V W Y /;

my $outputfile = "countaa";
open my $OUT, ">", $outputfile
    or die "Can't open file \"$outputfile\" for writing: $!\n";

my $proteinio = Bio::SeqIO->new(-file => "ec 1.1.1.fasta", -format => 'fasta');

while (my $seq = $proteinio->next_seq()) {
    my $seq_stats = Bio::Tools::SeqStats->new(-seq => $seq);
    my $count     = $seq_stats->count_monomers();   # hash ref: residue => count

    # One line per sequence; residues that do not occur are printed as 0.
    print $OUT join(' ', map { $_ || 0 } @$count{ @prot }), "\n";
}

close $OUT or die $!;
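If you want to see what the print line is doing without installing BioPerl, here is a minimal sketch of the same idiom in plain Perl. The helpers count_residues and format_counts are hypothetical names I made up for illustration, and the sequence is invented; count_residues only stands in for count_monomers, on the assumption that you get back a hash reference of residue => count. The hash slice @$count{@prot} pulls the values out in the fixed column order, and the map turns any missing or empty entry into 0.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Column order for the output, same as @prot in the script above.
my @prot = qw/ A C D E F G H I K L M N P Q R S T V W Y /;

# Hypothetical stand-in for count_monomers (not part of BioPerl):
# returns a hash reference mapping each letter to how often it occurs.
sub count_residues {
    my ($sequence) = @_;
    my %count;
    $count{$_}++ for split //, uc $sequence;
    return \%count;
}

# Hypothetical helper: take the counts in @prot order via a hash slice
# and replace missing or empty entries with 0.
sub format_counts {
    my ($count) = @_;
    return join ' ', map { $_ || 0 } @$count{@prot};
}

# Made-up example sequence, only for demonstration.
my $count = count_residues('MKVLAAGIVGL');
print format_counts($count), "\n";
```

Each output line has 20 columns, one per residue in @prot, so the lines from different sequences stay aligned even when a residue is absent.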

Hope this is of some help,

Chris