Hello, I am really sorry for not being specific. I should have shown what I was trying, so here it is:
#!/usr/bin/env perl
use 5.010;
for (<>) {
    if (/^>/) {
        # Header
    } elsif (/^[A-Z]+$/) {
        # Protein
        my $a = tr/A/A/;
        say "A: $a, length: " . length;
    }
}
There are two issues I am facing right now. First, some of the sequence entries in the input file are long and continue on the following lines (see the example below). But this script reads only the first line of an entry before moving on to the next entry, so I'm getting the wrong values for the length and the number of 'A's. Is there a way to fix this?
Example sequence:
>sp|P76347|YEEJ_ECOLI Uncharacterized protein YeeJ OS=Escherichia coli (strain K12) OX=83333 GN=yeeJ PE=3 SV=3
MATKKRSGEEINDRQILCGMGIKLRRLTAGICLITQLAFPMAAAAQGVVNAATQQPVPAQ
IAIANANTVPYTLGALESAQSVAERFGISVAELRKLNQFRTFARGFDNVRQGDELDVPAQ
VSEKKLTPPPGNSSDNLEQQIASTSQQIGSLLAEDMNSEQAANMARGWASSQASGAMTDW
LSRFGTARITLGVDEDFSLKNSQFDFLHPWYETPDNLFFSQHTLHRTDERTQINNGLGWR
HFTPTWMSGINFFFDHDLSRYHSRAGIGAEYWRDYLKLSSNGYLRLTNWRSAPELDNDYE
ARPANGWDVRAESWLPAWPHLGGKLVYEQYYGDEVA
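For the first issue, one possible approach is to keep appending sequence lines to a buffer and only count once the next header (or the end of the input) is reached. The following is a rough sketch rather than a definitive fix. The subroutine names count_records and summarize are made up for illustration, and it assumes that every entry starts with a ">" header line and that sequence lines contain only uppercase letters:

```perl
use strict;
use warnings;
use 5.010;

# Hypothetical helper: reads FASTA-style records from a filehandle and
# returns one [A-count, length] pair per record, joining continuation lines.
sub count_records {
    my ($fh) = @_;
    my (@records, $seq);
    while (my $line = <$fh>) {
        chomp $line;
        if ($line =~ /^>/) {
            # A new header ends the previous record, if there was one.
            push @records, summarize($seq) if defined $seq;
            $seq = '';
        } elsif ($line =~ /^[A-Z]+$/) {
            # Sequence or continuation line: append to the current record.
            $seq .= $line;
        }
    }
    # The last record has no following header, so handle it here.
    push @records, summarize($seq) if defined $seq;
    return @records;
}

# Hypothetical helper: counts 'A's and measures the full sequence length.
sub summarize {
    my ($seq) = @_;
    my $count = ($seq =~ tr/A//);   # tr with an empty replacement only counts
    return [$count, length $seq];
}

# Read from the files named on the command line, or from standard input.
say "A: $_->[0], length: $_->[1]" for count_records(\*ARGV);
```

Reading with while instead of for (<>) also avoids loading the whole file into memory at once, which may matter for large sequence files.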
Second, this script prints its output to the terminal, but I want the output written to a file. How and where do I declare the output file details?
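For the second issue, a minimal sketch of writing to a file from inside a Perl script could look like the following. The file name results.txt and the printed values are placeholders chosen for illustration, not something from my real data:

```perl
use strict;
use warnings;
use 5.010;

# Hypothetical output file name; replace it with the path you want.
my $outfile = 'results.txt';

# '>' creates the file or truncates an existing one; '>>' would append instead.
open(my $out, '>', $outfile) or die "Cannot open $outfile: $!";

# Giving say a filehandle block sends the line to the file, not the terminal.
say {$out} "A: 3, length: 10";    # placeholder values for illustration

close($out) or die "Cannot close $outfile: $!";
```

Alternatively, the script can be left unchanged and the shell can redirect its output, for example by appending > results.txt to the command that runs it.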