PerlMonks  

perlkhan 77

by perlkhan77 (Acolyte)
on Apr 22, 2009 at 11:14 UTC ( #759243=perlquestion )
perlkhan77 has asked for the wisdom of the Perl Monks concerning the following question:

open(FILE,"C:/Users/hp/Desktop/UNIQ.fasta");
$i = 0;
while($line = <FILE>) {
    chomp $line;
    if($line =~/(\>gi\|\d{1,})/) {
        if($i >= 1) {
            $hash{$def[$i]} = $list[$i];
        }
        $i++;
        $def[$i] = $line;
    }
    else {
        $list[$i] .= $line;
    }
}
open(FILE,">C:/Users/hp/Desktop/Amino_acid_count_data.txt");
print FILE "DEFINITION\tLENGTH\tG_RATIO\tE_RATIO\tD_RATIO\tA_RATIO\tV_RATIO\tR_RATIO\tS_RATIO\tK_RATIO\tN_RATIO\tT_RATIO\tM_RATIO\tI_RATIO\tQ_RATIO\tH_RATIO\tP_RATIO\tL_RATIO\tW_RATIO\tC_RATIO\tY_RATIO\tF_RATIO\tMiss_RATIO\n";
$o = 0;
foreach $k (keys %hash) {
    $o++;
    #print "$k => $hash{$k}\n";
    Passref($k,$hash{$k},$o);
}
###########################sub boundary################
sub Passref {
    my($define,$seq,$count) = @_;
    chomp ($define);
    print $define,"\n";
    print $seq,"\n";
    print $count,"\n";
    $len = length ($seq);
    #print "\n$len\n";
    $count_G = ( $seq =~ tr/G//);
    #print "$count_G\n";
    $count_E = ( $seq =~ tr/E//);
    $count_D = ( $seq =~ tr/D//);
    $count_A = ( $seq =~ tr/A//);
    $count_V = ( $seq =~ tr/V//);
    $count_R = ( $seq =~ tr/R//);
    $count_S = ( $seq =~ tr/S//);
    $count_K = ( $seq =~ tr/K//);
    $count_N = ( $seq =~ tr/N//);
    $count_T = ( $seq =~ tr/T//);
    $count_M = ( $seq =~ tr/M//);
    $count_I = ( $seq =~ tr/I//);
    $count_Q = ( $seq =~ tr/Q//);
    $count_H = ( $seq =~ tr/H//);
    $count_P = ( $seq =~ tr/P//);
    $count_L = ( $seq =~ tr/L//);
    $count_W = ( $seq =~ tr/W//);
    $count_C = ( $seq =~ tr/C//);
    $count_Y = ( $seq =~ tr/Y//);
    $count_F = ( $seq =~ tr/F//);
    $count_Miss = ( $seq =~s/[^YCWLPHQIMTNKSRVADEGF]//ig);
    #print "$count_Miss\n";
    $ratio_G = ($count_G/$len);
    #print "\n$ratio_G\n";
    $ratio_E = ($count_E/$len);
    $ratio_D = ($count_D/$len);
    $ratio_A = ($count_A/$len);
    $ratio_V = ($count_V/$len);
    $ratio_R = ($count_R/$len);
    $ratio_S = ($count_S/$len);
    $ratio_K = ($count_K/$len);
    $ratio_N = ($count_N/$len);
    $ratio_T = ($count_T/$len);
    $ratio_M = ($count_M/$len);
    $ratio_I = ($count_I/$len);
    $ratio_Q = ($count_Q/$len);
    $ratio_H = ($count_H/$len);
    $ratio_P = ($count_P/$len);
    $ratio_L = ($count_L/$len);
    $ratio_W = ($count_W/$len);
    $ratio_C = ($count_C/$len);
    $ratio_Y = ($count_Y/$len);
    $ratio_F = ($count_F/$len);
    $ratio_Miss = ($count_Miss/$len);
    open(FH,">>C:/Users/hp/Desktop/Amino_acid_count_data.txt");
    print FH "$define\t$len\t$ratio_G\t$ratio_E\t$ratio_D\t$ratio_A\t$ratio_V\t$ratio_R\t$ratio_S\t$ratio_K\t$ratio_N\t$ratio_T\t$ratio_M\t$ratio_I\t$ratio_Q\t$ratio_H\t$ratio_P\t$ratio_L\t$ratio_W\t$ratio_C\t$ratio_Y\t$ratio_F\t$ratio_Miss\n";
    #print "$define\t$len\t$ratio_G\t$ratio_E\t$ratio_D\t$ratio_A\t$ratio_V\t$ratio_R\t$ratio_S\t$ratio_K\t$ratio_N\t$ratio_T\t$ratio_M\t$ratio_I\t$ratio_Q\t$ratio_H\t$ratio_P\t$ratio_L\t$ratio_W\t$ratio_C\t$ratio_Y\t$ratio_F\t$ratio_Miss\n";
}

Re: perlkhan 77
by marto (Chancellor) on Apr 22, 2009 at 11:19 UTC

    Where is the question? You have just posted some Perl code, no description of the problem, your input or your output.

    The question title is just your user name, and not descriptive of your problem, and your closing code tag should be </code> and not <code>.

    Please read Writeup Formatting Tips and How do I post a question effectively?.

    Martin

Re: perlkhan 77
by dHarry (Abbot) on Apr 22, 2009 at 11:45 UTC

    Your code is full of mistakes/"sins". If you want to work with fasta files please take a look at fasta on CPAN. It might be a better idea to use an existing module to work with fasta files than to roll your own.

    Some general comments:

    • check for the success of open, don't assume everything is fine.
    • use the 3-arg form of open, see open.
    • put use strict; and use warnings; in your code. It would catch most of the errors in your script.
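The three points above can be sketched together as follows. This is only an illustration, not the original poster's script: the helper name `read_fasta` is hypothetical, and it treats any line starting with `>` as a header, which is looser than the `>gi|` pattern used in the question.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the original post): read a FASTA file into
# a hash that maps each header line to its concatenated sequence.
sub read_fasta {
    my ($path) = @_;

    # 3-arg open with a lexical filehandle; die with the OS error on failure.
    open( my $in, '<', $path )
        or die "Cannot open '$path' for reading: $!";

    my ( %seq_for, $header );
    while ( my $line = <$in> ) {
        chomp $line;
        if ( $line =~ /^>/ ) {          # assumption: any '>' line is a header
            $header = $line;
            $seq_for{$header} = '';
        }
        elsif ( defined $header ) {     # ignore anything before the first header
            $seq_for{$header} .= $line;
        }
    }
    close $in or die "Cannot close '$path': $!";

    return \%seq_for;
}
```

With `use strict` in place, every variable has to be declared with `my`, so a misspelled name becomes a compile-time error. A failed `open` stops the script with a message instead of silently reading nothing.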

Re: perlkhan 77
by apl (Monsignor) on Apr 22, 2009 at 12:06 UTC
    In addition to the earlier comments, I'd suggest changing (as an example)
    $count_G = ( $seq =~ tr/G//); $ratio_G = ($count_G/$len); # print "\n$ratio_G\n";
    to
    $ratio_G = Calculate( 'G' );
    Writing the sub Calculate is left as an exercise to the reader. The advantages are you
    • get rid of all of the temporary $count_* variables
    • remove duplicated code
    • make future changes (e.g. you want percentage rather than ratio) more centralized
    • make the intentions of your code much clearer
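One possible shape for such a sub is sketched below. It is an assumption, not apl's code: unlike the one-argument call shown above, this version takes the sequence as a second argument, so it does not depend on a global `$seq` or `$len`.

```perl
use strict;
use warnings;

# Hypothetical implementation of the Calculate() suggested above.
# Assumption: the sequence is passed in explicitly as the second argument.
sub Calculate {
    my ( $letter, $seq ) = @_;

    my $len = length $seq;
    return 0 if $len == 0;    # avoid division by zero on an empty sequence

    # tr/// does not interpolate variables, so count regex matches
    # in list context instead.
    my $count = () = $seq =~ /\Q$letter\E/g;

    return $count / $len;
}

# Example: build every ratio in one pass rather than with twenty variables.
my $example_seq = 'GAVGGE';
my %ratio = map { $_ => Calculate( $_, $example_seq ) } qw(G A V E);
printf "%s\t%.3f\n", $_, $ratio{$_} for sort keys %ratio;
```

Switching from a ratio to a percentage would then mean changing only the final `return` line.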
      sub Passref {
          my ( $define, $seq, $count, $outfh ) = @_;
          chomp($define);
          my $len = length($$seq);
          my %count = ( Miss => 0+( $$seq =~ m/^YCWLPHQIMTNKSRVADEGF/ig ) );
          # YCWLPHQIMTNKSRVADEGF doesn't appear in test file
          my %ratio = ( Miss => 0+( $count{Miss} / $len ) );
          for my $letter (qw[ E D A V R S K N T M I Q H P L W C Y F ]) {
              $count{$letter} = ( $$seq =~ m/$letter/g );
              $ratio{$letter} = $count{$letter} / $len;
          }
          print {$outfh} join "\t", $define, $len,
              @ratio{qw[ E D A V R S K N T M I Q H P L W C Y F Miss ]}, "\n";
      }
        For count you need
        $count{$letter} = ()= $$seq =~ m/$letter/g;
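The reason for this correction is that `m//g` in scalar context only reports whether one more match was found, whereas assigning the match to an empty list forces list context and yields the number of matches. A small sketch of the difference follows (the function names are made up for illustration):

```perl
use strict;
use warnings;

# Scalar context: m//g finds a single match and returns true or false.
sub match_in_scalar_context {
    my ( $seq, $letter ) = @_;
    my $found = ( $seq =~ m/\Q$letter\E/g );
    return $found ? 1 : 0;
}

# List context via "() =": the assignment returns the number of matches.
sub count_in_list_context {
    my ( $seq, $letter ) = @_;
    my $n = () = $seq =~ m/\Q$letter\E/g;
    return $n;
}

# tr/// with an empty replacement list counts characters without changing
# the string, but the letter must be written literally (no interpolation).
sub count_G_with_tr {
    my ($seq) = @_;
    return ( $seq =~ tr/G// );
}

my $demo = 'GAGGA';
print "scalar m//g: ", match_in_scalar_context( $demo, 'G' ), "\n";
print "list m//g:   ", count_in_list_context( $demo, 'G' ),   "\n";
print "tr///:       ", count_G_with_tr($demo),                "\n";
```

This is also why the `tr/G//` lines in the original question count correctly, while a plain `m/$letter/g` assigned to a scalar does not.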
Re: perlkhan 77
by perliff (Monk) on Apr 24, 2009 at 07:47 UTC
    People have already raised several issues with your code, so I will not repeat them, but I will add a few more that may help. You are much better off using Bioperl to read fasta files; you will be able to do much more, and more easily, with it. Another gentle piece of advice: learn (say, from this site or the Bioperl site) how to format your code so it doesn't look like it's being pressed against a wall. People like to see lean code... but not lean in the way you have here...
    ----------------------

    "with perl on my side"
