PerlMonks  

Help with program for proteins

by jemswira (Novice)
on Jan 05, 2012 at 01:39 UTC (#946311=perlquestion)
jemswira has asked for the wisdom of the Perl Monks concerning the following question:

I recently posted this code on PerlMonks asking for help, and got it working properly (thanks!):

$/ = "\/\/";
our @acnumbers = qw(P0A252 Q9AT80 Q0HKB6);
our $acnumbers;
foreach $acnumbers (@acnumbers) {
    my $unit;
    foreach $unit (<PFAMDB>) {
        my @units = split /#/, $unit;
        my @pfx = grep(/=GF AC/, @units);
        our $units;
        foreach $units (@units) {
            if ($units =~ /.*AC $acnumbers/) {
                push(@list, @pfx);
            } else { next }
        }
    }
    print "$acnumbers is in:";
    print @list;
    undef @list;
}

So what it did was to take this data

# STOCKHOLM 1.0

#=GF ID 1-cysPrx_C

#=GF AC PF10417.4

#=GF DE C-terminal domain of 1-Cys peroxiredoxin

...

#=GS D8BPP0_ECOLX/154-186 AC D8BPP0.1

#=GS D6I5T0_ECOLX/154-186 AC D6I5T0.1

...

//

...

and find the PFxxxx number of every entry that contains a given AC xxxxxx number.
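
A minimal sketch of that lookup on a single record, assuming each record looks like the excerpt above. The helper name parse_pfam_record is made up for illustration, and the sketch also assumes family accessions always start with PF followed by digits:

```perl
use strict;
use warnings;

# Hypothetical helper (not from the original post): takes the text of one
# Stockholm record and returns the Pfam accession without its version suffix,
# plus a reference to the list of member accessions (also without versions).
sub parse_pfam_record {
    my ($record) = @_;
    my ($pfam)   = $record =~ /^#=GF\s+AC\s+(PF\d+)/m;
    my @members  = $record =~ /^#=GS\s+\S+\s+AC\s+(\w+)/mg;
    return ($pfam, \@members);
}

# Sample record copied from the excerpt in the question.
my $record = <<'END';
# STOCKHOLM 1.0
#=GF ID 1-cysPrx_C
#=GF AC PF10417.4
#=GF DE C-terminal domain of 1-Cys peroxiredoxin
#=GS D8BPP0_ECOLX/154-186 AC D8BPP0.1
#=GS D6I5T0_ECOLX/154-186 AC D6I5T0.1
END

my ($pfam, $members) = parse_pfam_record($record);
print "$pfam contains: @$members\n";
```

Anchoring the patterns on #=GF and #=GS keeps a family accession from being confused with a member accession, which a bare /AC .../ match cannot guarantee.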

There's another file, which has many lines of:

>tr|A0A171|A0A171_PYRHR Glutamate synthase small subunit-like protein 1 OS=Pyrococcus horikoshii GN=gltY PE=4 SV=1

and I need to pull out the A0A171 and the A0A171_PYRHR Glutamate synthase small subunit-like protein 1 as well, then run the first code. So I wrote this:

foreach (<DATABASE>) { push(@uniprot, $_); }
my $i;
my $name;
foreach $uniprot (@uniprot) {
    our $acc;
    my @splitted = split /\||=/, $uniprot;
    foreach $i (@splitted) {
        if ($i =~ /\b\w{6}\b/ and $i !~ /\_/) {
            $acc = $i;
        } elsif ($i =~ /.+(?= OS)/) {
            $name = $i;
        } else { next; }
        my $unit;
        $/ = "\/\/";
        open PFAMDB, 'C:\Users\Jems\Desktop\Perl\PFAM.txt' or die $!;
        foreach $unit (<PFAMDB>) {
            my @units = split /#/, $unit;
            my @pfx = grep(/=GF AC/, @units);
            our $units;
            foreach $units (@units) {
                if ($units =~ /.*AC $acc/) {
                    push(@list, @pfx);
                } else { next }
            }
        }
        print "$acc, also called $name, is in:";
        print @list;
        undef @list;
    }
}

However, it no longer seems to get anything into @list. @pfx is still correct when I print it as a test, but it no longer gets pushed to @list (I think).

So am I doing anything wrong? Also, how would I remove the OS at the end of $name, and the #=GF AC and everything after the full stop in @pfx? Thanks in advance!
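
For the two clean-up questions, one possible approach is to capture only the wanted parts with a regex instead of splitting and trimming afterwards. This is a sketch under assumptions: the helper names parse_uniprot_header and pfam_ids are invented here, headers are assumed to follow the >db|accession|name ... OS= layout of the sample line, and family accessions are assumed to look like PF plus digits.

```perl
use strict;
use warnings;

# Hypothetical helper: pull the accession and the name (everything between
# the second "|" and " OS=") out of a UniProt-style FASTA header line.
sub parse_uniprot_header {
    my ($line) = @_;
    if ($line =~ /^>\w+\|(\w+)\|(.+?)\s+OS=/) {
        return ($1, $2);
    }
    return;    # empty list when the line does not look like a header
}

# Hypothetical helper: reduce strings such as "=GF AC PF10417.4\n" (what
# split /#/ leaves behind) to the bare accession "PF10417".
sub pfam_ids {
    my @ids;
    for my $entry (@_) {
        push @ids, $1 if $entry =~ /AC\s+(PF\d+)/;
    }
    return @ids;
}

my $header = '>tr|A0A171|A0A171_PYRHR Glutamate synthase small subunit-like protein 1 OS=Pyrococcus horikoshii GN=gltY PE=4 SV=1';
my ($acc, $name) = parse_uniprot_header($header);
my @clean = pfam_ids("=GF AC PF10417.4\n");

print "$acc, also called $name\n";
print "families: @clean\n";
```

Because the capture for the name stops before " OS=" and the capture for the family stops before the full stop, nothing has to be removed afterwards.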

Re: Help with program for proteins
by RichardK (Priest) on Jan 05, 2012 at 11:46 UTC

    Are you using strict and warnings?

    Where does $acc come from in

    if ($units =~ /.*AC $acc/)
      foreach $i (@splitted) {
          if ($i =~ /\b\w{6}\b/ and $i !~ /\_/) {
              $acc = $i;

      If the accession number in the first file matched one in a line of the second one, then I'd push @pfx into @list. At least that was the plan.
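
One way to express that plan with strict and warnings enabled is to read the Pfam file once into a hash keyed by accession and then look each accession up. This is only a sketch: build_index is a made-up helper, the in-memory sample stands in for the real files, and it assumes "//" appears only as the record terminator.

```perl
use strict;
use warnings;

# Hypothetical helper: read Stockholm records from a filehandle and return a
# hash reference mapping each member accession to the Pfam families it is in.
sub build_index {
    my ($fh) = @_;
    local $/ = "//";    # one record per read; the old value is restored on return
    my %families_for;
    while (my $record = <$fh>) {
        my ($pfam) = $record =~ /^#=GF\s+AC\s+(PF\d+)/m;
        next unless defined $pfam;    # skips trailing whitespace after the last "//"
        while ($record =~ /^#=GS\s+\S+\s+AC\s+(\w+)/mg) {
            push @{ $families_for{$1} }, $pfam;
        }
    }
    return \%families_for;
}

# Sample data standing in for PFAM.txt (assumption: same layout as the excerpt).
my $pfam_text = <<'END';
# STOCKHOLM 1.0
#=GF ID 1-cysPrx_C
#=GF AC PF10417.4
#=GS D8BPP0_ECOLX/154-186 AC D8BPP0.1
#=GS D6I5T0_ECOLX/154-186 AC D6I5T0.1
//
END

open my $fh, '<', \$pfam_text or die "Cannot open in-memory data: $!";
my $index = build_index($fh);
close $fh;

for my $acc (qw(D8BPP0 A0A171)) {
    my $hits = $index->{$acc} || [];
    my $text = @$hits ? join(', ', @$hits) : 'no family found';
    print "$acc is in: $text\n";
}
```

Building the index once avoids reopening and rescanning the Pfam file for every accession, and using local on $/ keeps the record separator change from leaking into code that reads the other file line by line.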

Re: Help with program for proteins
by umasuresh (Hermit) on Jan 05, 2012 at 13:46 UTC
