Hi PerlMonks,
I want to estimate a few codon usage parameters (tAI, CAI and encp) for a
coding sequence using the Perl module Bio::CUA, which I installed from CPAN.
I have written a script, s1.pl, but it fails to compile. The Bio::CUA distribution
contains several other modules, and I cannot work out how to call them correctly.
I would appreciate help from the monks so that I can get the codon usage values.
Here is my script s1.pl:
# Code begins here
################################
# To estimate tAI,CAI & encp of cds:
################################
#!/usr/bin/perl
use warnings;
use strict;
use Bio::CUA; # perl module
my $cds="ATGCGTCTCTTCAAAACGCGCAAATCCACGGATACCTACAGCACACTAGCCGCGCAGCAA
CAGCAACAGCAGCAGCAGCAACAACAACATCAAGCGGAAGGCAGCAACATTTCCCACAGC
AGCAACAGCAGCAGCAACAAGAGTCACACACCGGCAACATGCAGCAACAGACTGAACAAG
AGCATTGTGAGCAGCACCAGCATATCGTCATCGCTGCCTGATCTGCATGACAAGTCGCCC
GTCATGATCCTCAGCTGCACCACCCTGGCCAGCAATGGAGCCACCGCCACGGCAGCGGTC
ACAGCAACAGCCACCGGCACAGCAGCAACATCTGGCGGCTCGCTGCAGCAGCAACAACAG
CAGCATCTGCAACACCAGCAGCAGCAGCAGCCGTTACGCACGGCCACGCCCACGTGTCTG
CTGAGTGGCCGTCAGACGCCATCGGCCATATCGGTGATGTCGCTCCAAGAGGCCACCAGT
CTGCACCGCCAGCAACAGCAGCCACACCAGCCACCCACCATCTACGTGCCGGTGCCTACG
AAACTTGGCAACAATGTCAACACTGGCAACAGCTCGGCCACTCTGCTGCTCAGCTATGGC
AGCACCAGCAGCATCGCCAACCTGCAACAGCAGCAGCAGCAGCATGCCGCCCAGTACCAG
CAGTATGTTGCACAGCGGCTGCACGCCGCTTCCAGCAGTTGTTTGTACGAGAAGGGGTCG
AATGCCAGCGGTGGGGCGAGCAGCAACAAAAGCAGTCTATCCCTGACCCCAAATGGTCAC
TTGCCCGACTACAAGTTGGTGACAGCGATGCCAGTTGTTGTCCTGGACGATGAACACAAA
TCCAATTCATTGCCGGCCACTGAAGCGAGTCGCAACAGCAACAGCAGCAGCAACATGAAC
GGCAGCAGCAACAGCAACAGCCTTGACGTCAGCAACAGCAACTCGCATTCGGGGAGCTCC
ACTTCTTTGGCCAGCACCACGAGAAATGTTTTCACCTGGGGCAAGCGCATGAGTCGCAAA
CTGGATTTGCTGAAGCGGAGTGACTCGCCCGCCGCCGCCCACAAATCGCATTCGGATTTG
AGGAGTCTGTTCCACTCGCCAACGCACCACAAGAGTGGATCCGGTGGATCCAGCGGACCC
AGTTCGGCGAAGGCATCGGCCTCACCCACTGGCGGCCATCAGAACAGCTCCGGCTCCACG
ACCAGCACCCTCAAGAAGTGCAAGTCGGGGCCCATCGAGACCATCAAGCAGCGACACCAG
CAGCAGCAGCAGCAGCAGCAATCGGTTCAGGATGTGGGCACGGGACAGAGCCAGAGTGCT
CAGTCCACGCCCACGCATCAGTTCCAGGCGGCCGCCCGCCCACAGAAAGCGCTAAAGAAC
TTCTTCCATAGGATCGGGTCCACCGGCATGCTGAACCATCGCTCCCACAATCTCCTTAAG
GCTTCGGAGGCGGCTCAACAGGCCACCCCGGCAGCCACCACATTGTATAGGAGCAGCTCC
ACTAGCCAGCTGTCCAGCAGCTCCTATGTGAAGTGCGACGATCCCACCGAGGGACTGAAT
CTTCAGAGGGAGCAGCGGGAACAGCGTCTTCCGCGGATCGCCAGCCTGAAGTCCAGTAGC
TGCGATGACATAGCCAAGGTGAGCAGTTGCCTGACGGCCAGCACAAGTAGTGGCAGTGCC
GCAGGCAGCTTGGGCTCTCCTCCAAGTAGTGCAGCAGCTGGTGGAGGCGGAACTGCAAAC
AGCGGCCAACACGATCCCTCGCGTCGTGGTGCATTTCCTTACGCCTTCCTGCGATCACGT
CTCTCCGTTTTGCCAGAGGAGAACCACGGAAATGTACCAGGACACCTGAAGCAACAAATA
CAGCGGCAACGGGAGCAGCACCAGCAGCATCAGAGGGATCTCCTCCAGCAGGAGCAGACA
TCGTCGCCCCTTCCCCAGCGCCGATCCCCCGAACAGGCGATGCTGAACAATGTGTCACGC
AACGACAGCATCACCTCCAAGGACTGGGAACCACTTTACCAAAGATTAAGTAGTTGTCTA
AGTTCAAACGAGTCCGGCTACGACAGCGATGGGGGTGCGACGGGAGCCCGACTGGGCAAT
AATCTGAGCATCTCCGGCGGAGATACCGAATCTATTGCCTCGGGCACACTCAAGCGTAAC
TCGCTCATCTCCCTCAGCTCCTCGGAGGGCGTTGGAATGGGCATGGGCATGAGTCTGGGA
CTGGGTGCCCCATCGACGAGGAACAGCAGCATCTGCAGTGCTCCCGTGTCGCTGGGTGGC
TATAACTACGACTATGAGACGGAGACGATACGGCGACGATTCAGGCAGGTTAAGCTGGAG
CGCAAGTGCCAAGAGGACTACATCGGAATTGTCCTGTCGCCCAAAACGGTGATGACCAAT
AGCAATGAGCAGCAGTACAGGTATCTCATCGTGGAACTGGAACCCTATGGCATGGCCCAA
AAGGATGGTCGCCTTCGCCTGGGTGACGAGATCGTCAACGTAAATGGAAAACACCTGCGA
GGCATTCAATCCTTTGCAGAGGTTCAGCGCCTGTTGAGCAGCTTTGTGGACAACTGTATC
GACCTGGTGATTGCTCACGATGAGGTGACGACGGTAACTGATTTCTACACCAAAATCCGT
ATCGATGGGATGAGCACGCAGCGCCATCGGCTGAGTTATGTGCAACGCACACAGAGCACA
GACAGTCTGAGCAGCATGCAGAGTCTGCAGCTGCAGCAGGAGAGGATTCAGGGTCACAAT
ACGGAACAGGAGCAGGAGGCCCAGGGCGAGGATCAGTGCGATGCGCGTTCAATGGCCAGC
GTCAGCACAATGCCCACTCCGATGCCGCTGATGCAGCATCGTCGGAGCTCCACGCCCAGG
CACTCACTGGACGTCGGTGCGCCGGAGCATGAGCTCCTCAGGAGGCGGGCGCGCAGCTCC
TCAGGTCAGCGCAGCTTGGCTCTAACACCGACCCCACTCTTTGCCAGCGGCAGCAGCAGT
TGCTCCTCCTCCCCTAACCACCGGTTGCTGGATAACGAGAACGACCCTGCTAACGACACC
GATTCCTATACGCCAGTGTATGCAAATCGGGCGGCAAGCGTGTGCGTGGCCTCCTCCCTG
GCGGACGATGAGAAGTGGCAGTTACTGGCCCGAAAGCGCTGCTCGGAGGGTTCCGCCCTA
TCCGCTACACCGAACCCGCAGCAATTTGGCCAGCGCACTCACTACGCCAGAAACTCCATC
AATCTGGCCAACTCGCATTACCGCTCGCTCCGATTTGCCCACTCGCGGCTGAGTTCGTCT
CGCCTTAGTCTGTTCATGCAGGCACCGCCTAACAGTCTAACCGTCGGAGAAGGAGTCGCT
AACACCCCATCCTCTACAGCTACCACAACCACTGATCTCACTAACCAGCAGCAACAGCAG
CAAAACCAGCAACAGACACACCAATCACTGTACATCAAGCACTCGCCAAAGAGCGTCTCA
TTGTTCTCGCCTAATCCCTATGTTAACGCCTCATCCTCACCAGCTTCGGCATCCACATCA
GCGGGTGCCGGCTCCTCCCTGGCACCGCCAGCCGCTGCCCTAATGCATCACAGGCCATCG
CTTCCGGTGGCCAAGCTAACAATACGCGACGAGGAAATGGCGGAGGTCATCCGTGCGTCG
ATGAGCGAGGGTAGTGGACGTTGCACCCCGAAGACTATAACCTTCTTTAAGGGACCTGGA
CTGAAATCGTTGGGCTTCAGCATAGTGGGAGGTCGAGATTCGCCAAAGGGCAACATGGGA
ATTTTTGTAAAGACCGTGTTTCCCTCAGGCCAGGCAGCCGATGATGGCACACTGCAAGCG
GGCGACGAGATTGTAGAGATCAATGGAAACTCTGTGCAGGGCATGAGTCATGCCGAAACC
ATAGGACTCTTTAAGAACGTAAGAGAGGGCACCATTGTGCTAAAAATCTTAAGAAGAAAA
TTACAGAAAGCTAAATCGATGGGTTGCTAG";
$cds=~ s/\s//g;
##########################################################
# The following code has been taken from example of module:
##########################################################
my $calc = Bio::CUA::CUB::Calculator->new(
    -codon_table => 1,
    -tAI_values  => 'tai.out' # from Bio::CUA::CUB::Builder
);
# create an IO to a sequence file
my $io = Bio::CUA::SeqIO->new($cds);
# read each sequence as a Bio::CUA::Seq object from this io
while (my $seq = $io->next_seq)
{
    my $tai = $self->tai($seq);
    my $CAI = $self->cai($seq);
    my $encp = $self->encp_r($seq,[$minTotal,[$A,$T,$C,$G]]);
    printf("%10s: %.7f\n", $seq->id, $tai);
    printf("%10s: %.7f\n", $seq->id, $CAI);
    printf("%10s: %.7f\n", $seq->id, $encp);
    my $tai_val=printf("%10s: %.7f\n", $seq->id, $tai);
    my $CAI_val=printf("%10s: %.7f\n", $seq->id, $CAI);
    my $encp_val=printf("%10s: %.7f\n", $seq->id, $encp);
    print "\n Results of Cds:
tAI value=$tai_val; CAI value=$CAI_val; encp value=$encp_val\n";
}
exit;
When I run it from cmd, I get the following errors:
Microsoft Windows [Version 6.1.7600]
Copyright (c) 2009 Microsoft Corporation. All rights reserved.
C:\Users\x>cd desktop
C:\Users\x\Desktop>s1.pl
Global symbol "$self" requires explicit package name at C:\Users\x\Desktop\s1.pl line 97.
Global symbol "$self" requires explicit package name at C:\Users\x\Desktop\s1.pl line 98.
Global symbol "$self" requires explicit package name at C:\Users\x\Desktop\s1.pl line 99.
Global symbol "$minTotal" requires explicit package name at C:\Users\x\Desktop\s1.pl line 99.
Global symbol "$A" requires explicit package name at C:\Users\x\Desktop\s1.pl line 99.
Global symbol "$T" requires explicit package name at C:\Users\x\Desktop\s1.pl line 99.
Global symbol "$C" requires explicit package name at C:\Users\x\Desktop\s1.pl line 99.
Global symbol "$G" requires explicit package name at C:\Users\x\Desktop\s1.pl line 99.
Execution of C:\Users\x\Desktop\s1.pl aborted due to compilation errors.
C:\Users\x\Desktop>