Re^4: How to count the length of a sequence of alphabets and number of occurence of a particular alphabet in the sequence?

Hello, I am really sorry for not being specific. I should have displayed what I was trying. So here it is:

#!/usr/bin/env perl

use 5.010;

for (<>) {
    if (/^>/) {
        # Header
    } elsif (/^[A-Z]+$/) {
        # Protein
        my $a = tr/A/A/;
        say "A: $a, length: " . length;
    }
}
~
[download]

There are two issues I am facing right now. First, some of the sequence entries in the input file are long and are continued on the next line (see below for example). But this script reads only the first line (before moving on to the second entry) due to which I'm getting wrong values for the length and number of 'A's that I want. Is there a way to fix this?

Example sequence:

>sp|P76347|YEEJ_ECOLI Uncharacterized protein YeeJ OS=Escherichia coli
+ (strain K12) OX=83333 GN=yeeJ PE=3 SV=3
MATKKRSGEEINDRQILCGMGIKLRRLTAGICLITQLAFPMAAAAQGVVNAATQQPVPAQ
IAIANANTVPYTLGALESAQSVAERFGISVAELRKLNQFRTFARGFDNVRQGDELDVPAQ
VSEKKLTPPPGNSSDNLEQQIASTSQQIGSLLAEDMNSEQAANMARGWASSQASGAMTDW
LSRFGTARITLGVDEDFSLKNSQFDFLHPWYETPDNLFFSQHTLHRTDERTQINNGLGWR
HFTPTWMSGINFFFDHDLSRYHSRAGIGAEYWRDYLKLSSNGYLRLTNWRSAPELDNDYE
ARPANGWDVRAESWLPAWPHLGGKLVYEQYYGDEVA
[download]

Second, This script is giving me the output on the terminal. I want it to give me the output in a file. How and where do I declare the output file details?

Comment on Re^4: How to count the length of a sequence of alphabets and number of occurence of a particular alphabet in the sequence? Select or Download Code

Replies are listed 'Best First'.
Re^5: How to write to a file? by hippo (Bishop) on Oct 11, 2019 at 15:47 UTC
I want it to give me the output in a file. How and where do I declare the output file details? stevieb has already suggested that you should see open. Did you read that? Did you understand it? What about perlintro - have you read that? If not, you really should. It even contains some very simple examples of how to open and write to files.	[reply]
Re^6: How to write to a file? by davi54 (Sexton) on Oct 11, 2019 at 23:32 UTC
Yes, I did read that. And I am currently trying to implement it. But I am looking to solve the first question first. What can I do to fix that?	[reply]
Re^7: How to write to a file? by AnomalousMonk (Archbishop) on Oct 12, 2019 at 00:23 UTC
Assuming that your input file contains only FASTA header records and sequence records (i.e., no blank lines, comment lines, etc.), you might do something like (untested): my $rx_fasta_header = qr{ \A > }xms; # header pattern - very naive my $rx_sequence = qr{ \A [ACDEFGHIKLMNPQRSTVWY_]+ \z }xms; # plea +se check my $sequence; # sequence accumulator - initially undefined RECORD: while (my $record = <$read_filehandle>) { chomp $record; # get rid of newline, if any # fasta header - process any sequence accumulated so far if ($record =~ $rx_fasta_header) { # do not process if no sequence accumulated so far (i.e., firs +t header) next RECORD if not defined $sequence; my $seq_len = length $sequence; my $seq_n_A = $sequence =~ tr/A//;; do_something_with($sequence, $seq_len, $seq_n_A); undef $sequence; # prepare to process next sequence next RECORD; } # part of sequence - accumulate if ($record =~ $rx_sequence) { $sequence .= $record; # accumulate sub-sequences (records) next RECORD; } die "bad record: '$record'"; # don't know what this is } [download] (I'm not sure about the codon codes I've used to define `$rx_sequence`; please double-check this. (Update: Note, also, that the sequence letters I've used are all upper-case, but IIRC these can also be lower-case.)) Give a man a fish: `<%-{-{-{-<`	[reply] [d/l] [select]
Re^8: How to write to a file? by davi54 (Sexton) on Oct 15, 2019 at 17:32 UTC
Re^9: How to write to a file? by soonix (Canon) on Oct 15, 2019 at 19:27 UTC
Re^9: How to write to a file? by AnomalousMonk (Archbishop) on Oct 15, 2019 at 20:35 UTC
Some notes below your chosen depth have not been shown here


No such thing as a small change
	PerlMonks