PerlMonks  

Re^8: calculation of charged amino acids

by yuvraj_ghaly (Sexton)
on Jul 24, 2013 at 10:01 UTC (id://1046046)


in reply to Re^7: calculation of charged amino acids
in thread calculation of charged amino acids

Yes, my FASTA file does not contain spaces. Thanks... now it is working :)


Replies are listed 'Best First'.
Re^9: calculation of charged amino acids
by mtmcc (Hermit) on Jul 24, 2013 at 10:09 UTC
    Thanks be to St. Larry.

      I am analyzing the output and found that the program isn't showing the results starting from the first sequence. It shows the sequences in a seemingly random order. I want the program to show the output in the same order as the sequences appear in the file, starting with the first one.

      These are the FASTA sequences in my FASTA file. The problem is that the results are not shown in order from the beginning.

      >gi|226694487|sp|Q1DF98.2|ATPF_MYXXD RecName: Full=ATP synthase subuni +t b; AltName: Full=ATP synthase F(0) sector subunit b; AltName: Full= +ATPase subunit I; AltName: Full=F-type ATPase subunit b; Short=F-ATPa +se subunit b MFLPSVLAASNLVKVQPGLIFWTLVTFVIAAVVLKWKAWGPILSLVEEREKQIASSIESAKRERAEAEKL LADQKTAIAEARREAAEMMRRNTQEMEKFREELMAKSRKEAEELKLSARREIDEQKAKAIAEVRSMAVDL AMEVAGKLISERMDDSKQRALAEQFVQGLPLNSTSATGAVRRTA<br> >gi|172046103|sp|Q1D4N0.2|LIPA_MYXXD RecName: Full=Lipoyl synthase; Al +tName: Full=Lip-syn; Short=LS; AltName: Full=Lipoate synthase; AltNam +e: Full=Lipoic acid synthase; AltName: Full=Sulfur insertion protein +LipA MTETTRKPEWLKVRLPHGEGYERVKAIVKRTKLATVCEEARCPNIAECWGGGTATVMLMGEVCTRACRFC HVKVGAPPPLDPMEPIHLAQAVKEMDLEYIVVTSVNRDDRPDGGASHFASAIRELRRESPRTIVEVLIPD FKGVEKDLTTVAEAKPHVVAHNVETVERLTPTVRDRRAKYHQSLRVLEYLKNRPEGLYTKTSVMVGLGET DAELEQTFKDLRDVGVDVLTLGQYLQPSQYHLRVERFVTPAQFEAYKTLAESYGFLYVASGPLVRSSYRA AEFFMKGLMERERLERLG<br> >gi|123374798|sp|Q1DDB3.1|KDSB_MYXXD RecName: Full=3-deoxy-manno-octul +osonate cytidylyltransferase; AltName: Full=CMP-2-keto-3-deoxyoctulos +onic acid synthase; Short=CKS; Short=CMP-KDO synthase MQSCRTVAVIPARHASTRFPGKPLAIIAGRTMIEHVWRRCQEAQAFDEVWVATDDDRIRAAVEGFGGKAV MTSPACATGTDRVAEVALGRPDIDIWVNVQGDEPLVDPATLQRLAGLFQDASVRMGTLVRPLEADEAASP HVVKAVLALNGDALYFSRSLVPHVREPGTPVQRWGHIGLYGYRREVLLSLAKLAPTPLEDAEKLEQLRAL EHGIPIRCAKVTSHTVAVDLPGDVEKVEALMRARGG<br> >gi|123374766|sp|Q1DCG7.1|FMT_MYXXD RecName: Full=Methionyl-tRNA formy +ltransferase MSRPRIVFMGTPEFAVSSLAACFELGDVVAVVTQPDKPKGRGNTVTAPPVKELALSRGVPVLQPTKLRTP PFAEELRQYAPDVCVVTAYGRILPKDLLELPTHGCVNVHGSLLPRFRGAAPIQWAIAHGDTETGVSLMVM DEGLDTGPVLAMKRMAIAPDETSASLYPKLAALGGEVLREFLPAYLSGELKPVPQPSEGMVLAPIIEKDQ GRLDFTKPAVELERRLRAFTPWPGAFTTLGGKLLKVHRAQARGGSGAPGTVLASGPDGIEVACGEGSLVL LDLQPEGKRVMRAADFLQGHKLAPGSQPFVAG<br> >gi|123374695|sp|Q1DAN7.1|SSRP_MYXXD RecName: Full=SsrA-binding protei +n MTSGGKSKGVGSEPGVRVIAENRRARFDYTVDEKVEAGLALTGSEVKSLRDGIANLSDAYALPKGDELFL 
LNANIGSYKAASFFDHLPTRGRKLLMHRGEIDRWTAKVRERGYSIIPLVLYFRNGRAKVELGLCRGKTHE DRRHDIKERETKREMDRAMRRR<br> >gi|123374693|sp|Q1DAM1.1|PNP_MYXXD RecName: Full=Polyribonucleotide n +ucleotidyltransferase; AltName: Full=Polynucleotide phosphorylase; Sh +ort=PNPase MLKKSVKIGESELSIEVGRLAKQADGSVVVRYGDTMLLVTAVSAREKKDIDFLPLTVEYQEKLYSAGRIP GSYFKREGRLTEKETLASRLVDRSCRPLFPEGYAYETQIIASVISSDPENEGDIHGITGASAALWVSDIP FDGPIAGIRVGRVGGQLVANPTAKQREQSDLDLVMAVSRKAIVMVEGGAEEVSEADMVAALDFGFTTAQP ALDLQDELRRELNKQVRSFEKPAAVDEGLRAKVRELAMDGIKAGYGIKEKGARYEALGKTKKEALAKLKE QLGDGYTPLVEKHAKAVVEDLKYEHMREMTVNGGRIGDRGHDVVRSITCEVGVLPRTHGSAVFTRGETQA LVVTTLGTSDDEQRLEMLGGMAFKRFMLHYNFPPFSVNETKPLRGPGRREVGHGALAERALRNMVPKSES FPYTVRLVSDILESNGSSSMASVCGGTLALMDAGVPLKAPVAGIAMGLVKEGDKIAILSDILGDEDHLGD MDFKVCGTSKGITSIQMDIKITGLTTEIMSRALEQARQGRLHILGEMLKTLAESRKEISQYAPRITTIQI RPEFIKNVIGPGGKVIKDIIARTGAAINIEDSGRVDIASANGEAVKAAIAMIQALTREAEIGKIYTGTVR KIAEFGAFVELFPGTDGLIHISELSDKRVKSVSDVLNEGDEVLVKVVSIDKTGKIRLSRKEAMAERAAQQ GAAAGEAAAQPAPAPTQPDAKA<br> >gi|123374596|sp|Q1D8K2.1|SYM_MYXXD RecName: Full=Methionine--tRNA lig +ase; AltName: Full=Methionyl-tRNA synthetase; Short=MetRS MAERTLVTSALPYANGPLHIGHAVEYVQTDIYVRFLRSCGRDVVYFCADDTHGTPIELNAAKQGLKPEEF IARFHEEHQRDFHDLDVRFDYFHSTNSPENRQYAELIYGRLKEKGDIERRNIEQTYCENDRRFLPDRFIK GTCPNCKASDQYGDACEKCGKAYDPTDLIDARCALCGTPPVRKHSEHLFFKLSRHEDFLQDVLRKPGFIH PGLATQLQGFFEKGLSDWDISRDGPYFGFAIPGETDKYFYVWLDAPIGYIATTEKWAKETGKAKSALDYW SADADTRIIHFIGKDIVYFHALFWPAVLNVAGFHIPSEIKVHGHLMLNGEKMSKTRGTMVPVRDYLDQLD PSYLRYFYAANLGPGVEDLDLNLKDFRQRVNGELVNNVGNLANRALSLLAGPLEKRLAPGRAEGPGRELV EAALARVPEVRDAFDKLEYRSAIRVITEIASAANGFLQTAAPWAQVKKDAEVARADLSDAADVAYLLGAL LAPVTPRLSEKLFAQLGAEPLTFQALEGAKYPLLDRSRPIGTPEPLLPRLEEERVNAIIKLPEGAAAPQA AEARPAKGAKTEKKPAEAPAATQAAAPSAGAAESPGEIEYDDFAKVVLKAGKVLAAEKVQKADKLLKLTV DVGEGSPRTIVSGIAEAFAPEALTGRNVVVVANLKPRKLKGIESRGMLLTAGPGGKELSLLDPGDVAPGS EVK<br> >gi|123374566|sp|Q1D7X3.1|GCSH_MYXXD RecName: Full=Glycine cleavage sy +stem H protein MADNIPGDLKYTREHEWARVQGTSVVVGVTQHAQESLGDVVYVELPKVGSTVTEGKQFGVIESTKAVSEL 
YSPLTGKVVKVNDGLSDNPSTVNTDPYGAGWIVEIEPSDPKQVDGLMDAAAYTALLQNS<br> >gi|123374515|sp|Q1D773.1|RL23_MYXXD RecName: Full=50S ribosomal prote +in L23 MNLNDVIKGPLITEKLDKAREKFRQYSFIVDRKATKHDVARAVETLFKVTVEGVNTNIVRGKIKRVGRSI GKRPNFKKAVVTLKQGDSIELFEGGAA<br> >gi|123374507|sp|Q1D751.1|RS13_MYXXD RecName: Full=30S ribosomal prote +in S13 MARIAGIDLPPNKRAVISLQYIYGIGNKSAQDIIAAAGIDPTTRTKDLTEEQARKIREIIEASYKVEGDL RREVTMNIKRLMDLGCYRGLRHRKGLPVRGQRTHTNARTRKGPKRGIVRAKPAAPAR<br> >gi|123374498|sp|Q1D6W9.1|PIMT_MYXXD RecName: Full=Protein-L-isoaspart +ate O-methyltransferase; AltName: Full=L-isoaspartyl protein carboxyl + methyltransferase; AltName: Full=Protein L-isoaspartyl methyltransfe +rase; AltName: Full=Protein-beta-aspartate methyltransferase; Short=P +IMT MGDWGRADYLSRHGIKDARVLEAIARLNRADFVPEDLREEASADSPLPIGHGQTISQPYVVALMTEALQL QGDERVLEIGTGSGYQTALLSLLCREVYSVEIVPELAQSAREVLGRQGFENVSFREGDGSLGWPDQAPFD AILAAAAPPDVPLQLLSQLKPGGRMIIPVGPRGGTQQLLRIQRALRPGEVPQVESLLSVRFVPMTGQPLS QG<br> >gi|123374481|sp|Q1D6N0.1|MNMA_MYXXD RecName: Full=tRNA-specific 2-thi +ouridylase MnmA MRVVVAMSGGVDSSAAAALLKEQGHEVIGITLRVWSYEGKATCGSCCSPDDIDDARAVAQTLGIPFYVAN AEEIFQDRVINPFVQSYLGGRTPIPCVACNRDVKFNFLLKRARALGARLATGHYARVEEVDGRFVLRRAV DAAKDQSYFLFTLGQDELRDILFPVGGMTKAEVRAVAERHGLVTSQKPESMEICFVPDGDYAGFVEKVAG PQPAGDIVDTEGNVLGTHQGIHRYTVGQRKGLNLGGGEIRYVHRLEPETQRVVVGPAEGTGRDNFGLLQP HWVDGPPPASQPVEVRIRHRHSGAQGRVHVSPHGLVSVKLDAPARAVTPGQAAVVYDQDRVLGGGWIV<b +r> >gi|123374426|sp|Q1D5J6.1|MUTS_MYXXD RecName: Full=DNA mismatch repair + protein MutS MNEGAGAREIASLTPMMRQYMEVKALHPDSLLFFRLGDFYEMFFEDAVKASEILQITLTARSKGADKVPM CGVPYHAARRYIGRLVSEGLKVAICEQVEEPGNGPGIVRREVTRVITPGMVLDEEVLEPQASNFLAAVSW NDKGWGAALLEASTGEFMALEAPGIAELAESLSRVEPRELLVPDGKRDAPEVAQLLARLVRTPAVAEGEA ASFEPTRAAGYLRSHFAVQSLSAFGLDDAPLAAGAAGAALRYLKDTQKTAAAHVDRLSRQERGGNLLMDE SSRANLEVLRSLRDGGRKGSLLGVLDKTVTSLGARKLARWLASPLGSLPEIHARLDAVEELSGRSVWREE LAGILKEVGDLERLCGRLSLGAGNARDLRALGLSLAQLPRVVAVLARCESPLLKSLTGPLSALPELAELL SRAVAEEPPVTLKDGGMIRAGFHAELDKLVALSTSGKDLLLQIEQREKERTGISSLKVRYNKVFGYYLEV 
TKSNLDRVPKDYIRKQTTVNSERFVTPELKEYEEQVLTAEERRCALEIQLFEELRAQVVSAAPRIRSAAE AVATGDALLSFARCAAEYGYTRPEVDASVALSITAGRHPVVERMLGAGDSFVPNDVRLDPAEDAQLMVIT GPNMAGKSTVMRQVALTALMAQAGSFVPAKAARIGLCDRIFTRVGAADNLARGQSTFMVEMTETSHILHH ATNKSLIILDEIGRGTSTFDGLSIAWAVAEHLHDTVGARALFATHYHELVDLARERPRVKNLCVAVKEQN GKVIFLRKLVPGGASRSYGIEVAKLAGLPPEVVGRARELLQNLESGELDDAGRPRVAVRQPQGGRRGAST GQLGLFGMEPAQGGTGVTPAQQKALDALKGASIDRMTPLDALNLLAKLQRELE<br> >gi|123374387|sp|Q1D4R3.1|NADD_MYXXD RecName: Full=Probable nicotinate +-nucleotide adenylyltransferase; AltName: Full=Deamido-NAD(+) diphosp +horylase; AltName: Full=Deamido-NAD(+) pyrophosphorylase; AltName: Fu +ll=Nicotinate mononucleotide adenylyltransferase; Short=NaMN adenylyl +transferase MRPAVQVALLGGSFNPPHVGHLMAATYVHATQDVDEVWLMPSWQHPFGKQMEPFEHRVAMCDALCAETSG WLKTSRIEQEPGLSGRTVDTLTLLVARHPDIRWSIIIGSDILRDLPHWKDFHRIEELSRVMVLNRAGYPA PNTLGPPLAEVSSTLIRDLLARGEAPSDLVPARAIAYAREHGLYGLKRTP<br> >gi|123374295|sp|Q1D375.1|YBEY_MYXXD RecName: Full=Endoribonuclease Yb +eY MSGARRGNGVRLRKGKLIPRDDGKRIEEFVGAATTSTDSASVARMLAPPGWSEPAQRPEFDEVVIVLTGE LTIVVEGRRERISAGEVGLVPRGKRVVYRNDGQGACDYWSVCAPAFRPELAHMETPKPRVQENHVTIQVA HGQGRDFARLLTTWARAYLVQLELSGVELSLSLVDDRAIRRLNRTWRKKDKATDVLSFPAGDLPKGTPGP RPLGDVVISLDTAKRQAKEYGRTLESEMARYLAHGLLHLLGHDHERPRDAKRMAALEEQLLGERGMVADS LQVDAKARRARSLM<br> >gi|123374287|sp|Q1D339.1|PLSX_MYXXD RecName: Full=Phosphate acyltrans +ferase; AltName: Full=Acyl-ACP phosphotransacylase; AltName: Full=Acy +l-[acyl-carrier-protein]--phosphate acyltransferase; AltName: Full=Ph +osphate-acyl-ACP acyltransferase MRLVLDAMGGDHAPAAPVEGGVLFARAHPGHEVLLVGDEAKVAPLLGKLRPPSNLQVHHASEVVEMDEHA STAFRRKRDSSLRVGFELVRDGRAEALVSAGNSGAVMAGGLLTLGRLPGVERPAIAALFPALKGGGRCLL LDAGANVDCKPTHLAQFAVMGEAYVRARMGVARPRVAVLSNGEESSKGTPLTREASGLLRRSDLDFVGYV EGKDLFSGEVQVVVTDGFTGNVVLKTSEGVGMGVIGMLRQAIERRGGLAEKVGAMLLQPALAGLRRVVDY AEYGGAPLLGIQGVGIVAHGRSTPRALFNALGAALAMAEGGVQAELTRCIGRAAAWLPTHPKGKRATDAG VSD<
        Dude.

        I'm not a code writing service.

        Please read How do I post a question effectively?

        This is very likely the last update I'll ever write for this. Ever.

        #!/usr/bin/perl
        use strict;
        use warnings;

        my $file = $ARGV[0];
        my @currentProtein;
        my %protein;
        open (my $fasta, "<", $file) || die "Can't open $file: $!\n";
        my $fastaLine;
        my $protSequence;
        my @fastaTag;
        my @tagOrder;    # header tags in file order, so the output follows the file

        while (<$fasta>)
        {
            # print STDERR "CURRENT: $_";    # debugging aid
            chomp;
            if (($_ !~ /^>/) && ($_ =~ /\w/))
            {
                push (@currentProtein, $_);
            }
            if ((/^>/) || (eof($fasta)))
            {
                # A new header or the end of the file closes the previous record
                if (defined $fastaLine && @currentProtein > 0)
                {
                    $protSequence = join("", @currentProtein);
                    $protSequence =~ s/ //g;
                    @fastaTag = split(" ", $fastaLine);
                    $protein{$fastaTag[0]} = $protSequence;
                    push (@tagOrder, $fastaTag[0]);
                    @currentProtein = ();
                }
                $fastaLine = $_ if /^>/;
            }
        }
        close $fasta;

        # Walk the tags in the order they were read, not in hash order
        for (@tagOrder)
        {
            my $count_of_acidic  = 0;
            my $count_of_basic   = 0;
            my $count_of_neutral = 0;
            my $sequence = $protein{$_};
            $sequence =~ s/\s//g;
            my @prot = split("", $sequence);    # split the string into single residues

            for my $aa (@prot)
            {
                # Classes as used in this thread. Note that N and Q are amides
                # and are usually counted as neutral rather than acidic.
                if ($aa =~ /[DNEQ]/i)
                {
                    $count_of_acidic++;
                }
                elsif ($aa =~ /[KRH]/i)
                {
                    $count_of_basic++;
                }
                else
                {
                    $count_of_neutral++;    # neither acidic nor basic
                }
            }
            print "\nName: $_\n";
            print "Number of acidic amino acids:".$count_of_acidic."\n";
            print "Number of basic amino acids:".$count_of_basic."\n";
            print "Number of neutral amino acids:".$count_of_neutral."\n";
        }
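The "random" order described above is what you get when you loop over the keys of a Perl hash, since hash key order is not the insertion order. A common remedy is to record each header in an array as it is read and loop over that array instead. The following is only a minimal sketch of that idea, not the script from this thread: the names `parse_fasta` and `count_classes`, the two-record sample input, and the residue classes (acidic = D, E; basic = K, R, H; everything else counted as "other") are assumptions made for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Read FASTA records from a filehandle. Returns a hash ref of sequences
# keyed by ID and an array ref holding the IDs in the order they were seen.
sub parse_fasta {
    my ($fh) = @_;
    my (%seq_for, @order, $id);
    while (my $line = <$fh>) {
        chomp $line;
        next unless $line =~ /\S/;        # skip blank lines
        if ($line =~ /^>(\S+)/) {         # header: the ID is the first token
            $id = $1;
            push @order, $id;             # remember file order
            $seq_for{$id} = '';
        }
        elsif (defined $id) {             # sequence line of the current record
            $line =~ s/\s+//g;
            $seq_for{$id} .= $line;
        }
    }
    return (\%seq_for, \@order);
}

# Count residues per class. tr/// with an empty replacement list only
# counts the listed characters and leaves the string unchanged.
# Assumed classes: acidic = D, E; basic = K, R, H; the rest is "other".
sub count_classes {
    my ($seq) = @_;
    $seq = uc $seq;
    my $acidic = ($seq =~ tr/DE//);
    my $basic  = ($seq =~ tr/KRH//);
    return ($acidic, $basic, length($seq) - $acidic - $basic);
}

# Hypothetical input, read through an in-memory filehandle.
my $data = ">seqB demo\nDEKA\nAA\n>seqA demo\nKRHG\n";
open my $fh, '<', \$data or die "Can't open in-memory file: $!";
my ($seq_for, $order) = parse_fasta($fh);
close $fh;

# Iterate over @$order, not keys %$seq_for, so seqB is reported before seqA.
for my $id (@$order) {
    my ($acidic, $basic, $other) = count_classes($seq_for->{$id});
    print "$id acidic=$acidic basic=$basic other=$other\n";
}
```

Looping over `sort keys %$seq_for` would also give a stable order, but it would be alphabetical rather than the order of the file, which is why the separate array is kept.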
