Beefy Boxes and Bandwidth Generously Provided by pair Networks
No such thing as a small change
 
PerlMonks  

Re^4: How to count the length of a sequence of alphabets and number of occurence of a particular alphabet in the sequence?

by davi54 (Sexton)
on Oct 11, 2019 at 15:05 UTC ( [id://11107342]=note: print w/replies, xml ) Need Help??


in reply to Re^3: How to count the length of a sequence of alphabets and number of occurence of a particular alphabet in the sequence?
in thread How to count the length of a sequence of alphabets and number of occurence of a particular alphabet in the sequence?

Hello, I am really sorry for not being specific. I should have displayed what I was trying. So here it is:
#!/usr/bin/env perl use 5.010; for (<>) { if (/^>/) { # Header } elsif (/^[A-Z]+$/) { # Protein my $a = tr/A/A/; say "A: $a, length: " . length; } } ~

There are two issues I am facing right now. First, some of the sequence entries in the input file are long and are continued on the next line (see below for example). But this script reads only the first line (before moving on to the second entry) due to which I'm getting wrong values for the length and number of 'A's that I want. Is there a way to fix this?

Example sequence:
>sp|P76347|YEEJ_ECOLI Uncharacterized protein YeeJ OS=Escherichia coli + (strain K12) OX=83333 GN=yeeJ PE=3 SV=3 MATKKRSGEEINDRQILCGMGIKLRRLTAGICLITQLAFPMAAAAQGVVNAATQQPVPAQ IAIANANTVPYTLGALESAQSVAERFGISVAELRKLNQFRTFARGFDNVRQGDELDVPAQ VSEKKLTPPPGNSSDNLEQQIASTSQQIGSLLAEDMNSEQAANMARGWASSQASGAMTDW LSRFGTARITLGVDEDFSLKNSQFDFLHPWYETPDNLFFSQHTLHRTDERTQINNGLGWR HFTPTWMSGINFFFDHDLSRYHSRAGIGAEYWRDYLKLSSNGYLRLTNWRSAPELDNDYE ARPANGWDVRAESWLPAWPHLGGKLVYEQYYGDEVA
Second, This script is giving me the output on the terminal. I want it to give me the output in a file. How and where do I declare the output file details?
  • Comment on Re^4: How to count the length of a sequence of alphabets and number of occurence of a particular alphabet in the sequence?
  • Select or Download Code

Replies are listed 'Best First'.
Re^5: How to write to a file?
by hippo (Bishop) on Oct 11, 2019 at 15:47 UTC
    I want it to give me the output in a file. How and where do I declare the output file details?

    stevieb has already suggested that you should see open. Did you read that? Did you understand it?

    What about perlintro - have you read that? If not, you really should. It even contains some very simple examples of how to open and write to files.

      Yes, I did read that. And I am currently trying to implement it. But I am looking to solve the first question first. What can I do to fix that?

        Assuming that your input file contains only FASTA header records and sequence records (i.e., no blank lines, comment lines, etc.), you might do something like (untested):

        my $rx_fasta_header = qr{ \A > }xms; # header pattern - very naive my $rx_sequence = qr{ \A [ACDEFGHIKLMNPQRSTVWY_]+ \z }xms; # plea +se check my $sequence; # sequence accumulator - initially undefined RECORD: while (my $record = <$read_filehandle>) { chomp $record; # get rid of newline, if any # fasta header - process any sequence accumulated so far if ($record =~ $rx_fasta_header) { # do not process if no sequence accumulated so far (i.e., firs +t header) next RECORD if not defined $sequence; my $seq_len = length $sequence; my $seq_n_A = $sequence =~ tr/A//;; do_something_with($sequence, $seq_len, $seq_n_A); undef $sequence; # prepare to process next sequence next RECORD; } # part of sequence - accumulate if ($record =~ $rx_sequence) { $sequence .= $record; # accumulate sub-sequences (records) next RECORD; } die "bad record: '$record'"; # don't know what this is }
        (I'm not sure about the codon codes I've used to define $rx_sequence; please double-check this. (Update: Note, also, that the sequence letters I've used are all upper-case, but IIRC these can also be lower-case.))


        Give a man a fish:  <%-{-{-{-<

Log In?
Username:
Password:

What's my password?
Create A New User
Domain Nodelet?
Node Status?
node history
Node Type: note [id://11107342]
help
Chatterbox?
and the web crawler heard nothing...

How do I use this?Last hourOther CB clients
Other Users?
Others avoiding work at the Monastery: (3)
As of 2024-04-20 02:52 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    No recent polls found