Searching file and printing
by jemswira (Novice) on Dec 30, 2011 at 15:17 UTC
jemswira has asked for the
wisdom of the Perl Monks concerning the following question:
OK, so for this research project I have a file with data arranged like so:
# STOCKHOLM 1.0
#=GF ID 1-cysPrx_C
#=GF AC PF10417.4
#=GF DE C-terminal domain of 1-Cys peroxiredoxin
#=GS D8BPP0_ECOLX/154-186 AC D8BPP0.1
#=GS D6I5T0_ECOLX/154-186 AC D6I5T0.1
It's basically proteins and functional groups. The functional groups are the ones in the #=GF AC PFxxxxx lines, and the proteins are the ones in the #=GS lines, e.g. D8BPP0.
I want to end up with a list saying which groups each protein is in, e.g. "D8BPP0 is in groups: PFxxxxx", and so on.
I thought I would put the list of proteins (they're in a big file) into an array and then loop over it, putting each protein into a scalar. For each protein I'd read the second file, the one with all the data shown above, with $/="\/\/"; so that I get one record at a time, and split each record on #. Then I'd use grep to check whether a piece was the functional group line, and check whether the protein was in that functional group. If it was, I'd push the functional group onto an array, and at the end of the loop I'd print the array out and go on to the next protein.
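For reference, the plan above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the script that produced the output further down: the helper name groups_for_proteins is hypothetical, each record is assumed to end with a line containing only //, and the protein ID is assumed to be the part of the #=GS ... AC value before the version dot. It reads the data file once instead of once per protein, and it matches the #=GF AC and #=GS lines with regexes rather than splitting on #.

```perl
use strict;
use warnings;

# Hypothetical helper: returns a hash ref mapping each wanted protein ID
# to the list of family accessions whose records mention it.
sub groups_for_proteins {
    my ($fh, @proteins) = @_;
    my %wanted = map { $_ => 1 } @proteins;
    my %groups;    # protein ID => [ family accessions ]

    # Assumption: every record ends with a line containing only "//".
    local $/ = "//\n";
    while (my $record = <$fh>) {
        # Family accession, e.g. "PF10417.4" from "#=GF AC PF10417.4".
        my ($family) = $record =~ /^#=GF\s+AC\s+(\S+)/m or next;

        # Protein IDs, e.g. "D8BPP0" from
        # "#=GS D8BPP0_ECOLX/154-186 AC D8BPP0.1".
        my %seen;
        while ($record =~ /^#=GS\s+\S+\s+AC\s+([^.\s]+)/mg) {
            my $id = $1;
            next if !$wanted{$id} || $seen{$id}++;
            push @{ $groups{$id} }, $family;
        }
    }
    return \%groups;
}

# Usage on the sample record from above, read from an in-memory filehandle.
my $sample = <<'END';
# STOCKHOLM 1.0
#=GF ID 1-cysPrx_C
#=GF AC PF10417.4
#=GF DE C-terminal domain of 1-Cys peroxiredoxin
#=GS D8BPP0_ECOLX/154-186 AC D8BPP0.1
#=GS D6I5T0_ECOLX/154-186 AC D6I5T0.1
//
END

open my $fh, '<', \$sample or die "Cannot open in-memory sample: $!";
my $groups = groups_for_proteins($fh, qw(D8BPP0 P0A252));

for my $id (qw(D8BPP0 P0A252)) {
    my @in = @{ $groups->{$id} || [] };
    print "$id is in: ", (@in ? join(', ', @in) : '(none found)'), "\n";
}
```

Building a hash keyed by protein ID means the big data file is scanned only once, however long the protein list is; if the file uses Windows line endings, the "//\n" separator would need adjusting.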
Example with a simplified list of proteins:
But all I get is:
P0A252 is in:=GF AC PF10417.4
Q9AT80 is in:Q0HKB6 is in:
How should I improve it?
Sorry for the messiness, but I really only just learned Perl.