FamousLongAgo has asked for the wisdom of the Perl Monks concerning the following question:
Hello, fellow monks!
I have been writing a parser for some Protein Data Bank files, for a bioinformatics project. I have no problem extracting the sequences I need, but I am stumped by the titles. Here's the problem:
The files start out in this format:

    HEADER METAL BINDING PROTEIN 31-AUG-98 1BSW
    TITLE ACUTOLYSIN A FROM SNAKE VENOM OF AGKISTRODON ACUTUS AT PH
    TITLE 2 7.5
    COMPND MOL_ID: 1;
    COMPND 2 MOLECULE: ACUTOLYSIN A;
    ...

The lines beginning with TITLE are the ones I'm interested in grabbing. There's a little caveat in that after the first line, the line number gets prepended to the title fragment. So in this example, the actual title is "Acutolysin A from snake venom of agkistrodon acutus at pH 7.5".
So far so dull. But later in the file, sometimes much later, there may be lines that also begin with TITLE. We want to ignore those.
Assuming the following constraints:
- We treat the file as an array ( no slurping into a scalar )
- There is no way to distinguish the later TITLE elements by pattern matching.
I know how to do this with regular expressions on a scalar, and how to do it in a very inelegant way by setting flags in a loop, but I suspect there is greater wisdom out there and can't wait to learn.
Special bonus to anyone who can tell me what an agkistrodon acutus is, and how deadly is its bite.
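For readers following along, here is one possible sketch that stays within the constraints above: it walks the array once, collects lines while they match TITLE, and leaves the loop as soon as the first run ends, so later TITLE lines are never examined. The sub name `first_title` and the final "later TITLE" sample line are made up for illustration, and the code assumes that only continuation lines carry a leading sequence number. It is not taken from any of the replies.

```perl
use strict;
use warnings;

# Return the title built from the first contiguous run of TITLE lines.
# Takes an array reference; returns nothing (undef in scalar context)
# when no TITLE line is found.
sub first_title {
    my ($lines) = @_;
    my @parts;
    for my $line (@$lines) {
        if ( $line =~ /^TITLE\s+(.*)/ ) {
            push @parts, $1;
        }
        elsif (@parts) {
            last;    # the first block has ended; ignore everything after it
        }
    }
    return unless @parts;

    # Assumption: every line after the first starts with its sequence
    # number (2, 3, ...), while the first line is never numbered.
    s/^\d+\s+// for @parts[ 1 .. $#parts ];
    s/\s+$//    for @parts;    # drop right-hand padding, if any
    return join ' ', @parts;
}

my @lines = (
    'HEADER METAL BINDING PROTEIN 31-AUG-98 1BSW',
    'TITLE ACUTOLYSIN A FROM SNAKE VENOM OF AGKISTRODON ACUTUS AT PH',
    'TITLE 2 7.5',
    'COMPND MOL_ID: 1;',
    'TITLE A LATER TITLE LINE THAT SHOULD BE IGNORED',    # hypothetical
);

# prints "ACUTOLYSIN A FROM SNAKE VENOM OF AGKISTRODON ACUTUS AT PH 7.5"
print first_title( \@lines ), "\n";
```

The emptiness of `@parts` doubles as the "have we seen the block yet" state, which avoids a separate flag variable. Only the lines after the first are stripped of a leading number, so a title that itself begins with digits on its first line is left alone.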
Replies are listed 'Best First'.
Re: Finding first block of contiguous elements in an array
by dws (Chancellor) on Dec 21, 2002 at 06:10 UTC
Re: Finding first block of contiguous elements in an array
by hossman (Prior) on Dec 21, 2002 at 06:05 UTC
Re: Finding first block of contiguous elements in an array
by tachyon (Chancellor) on Dec 21, 2002 at 06:10 UTC
Re: Finding first block of contiguous elements in an array
by Aristotle (Chancellor) on Dec 21, 2002 at 12:21 UTC
Re: Finding first block of contiguous elements in an array
by Arien (Pilgrim) on Dec 21, 2002 at 16:29 UTC
by BrowserUk (Patriarch) on Dec 21, 2002 at 16:36 UTC