PerlMonks  

Re: Extract the matching strings

by poulhs (Beadle)
on Nov 11, 2010 at 23:18 UTC


in reply to Extract the matching strings

Something like:

LINE: while ( <PROTEIN> ) {
    if ( /^VERSION\s+(\S+)/ ) {
        # extracts the first non-space sequence after the VERSION-token
        $protname = $1;
        next LINE;
    }
    if ( /^DBSOURCE\s+.*\s(\S+)\s*$/ ) {
        # extracts the last non-space sequence on the DBSOURCE line
        $rna = $1;
        next LINE;
    }
}

First you need to determine the syntax of the lines you want, and where on those lines the values you want to extract are located:
Is "YP_001648463.1" always the first field after VERSION? Is the RNA always the last field on the DBSOURCE line?
You should also decide what to do with invalid input: what happens if the DBSOURCE line is not present, or if $protname does not match the ACCESSION value?
For more info on regular expressions, check out perldoc perlre.
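
To make the point about invalid input concrete, here is a rough sketch of the same loop wrapped in a sub that reports missing lines instead of silently leaving the variables unset. The sub name extract_fields and the sample record are my own inventions for illustration (only "YP_001648463.1" comes from this thread), and I am assuming one record per filehandle:

```perl
use strict;
use warnings;

# Hypothetical helper (not from the original question): reads lines from
# a filehandle and returns the VERSION value and the last field of the
# DBSOURCE line. Either value is undef when its line was not found.
sub extract_fields {
    my ($fh) = @_;
    my ( $protname, $rna );
    while ( my $line = <$fh> ) {
        if ( $line =~ /^VERSION\s+(\S+)/ ) {
            # first non-space sequence after the VERSION token
            $protname = $1;
            next;
        }
        if ( $line =~ /^DBSOURCE\s+.*\s(\S+)\s*$/ ) {
            # last non-space sequence on the DBSOURCE line
            $rna = $1;
            next;
        }
    }
    return ( $protname, $rna );
}

# Made-up sample record; the GI and NC_ values are placeholders.
my $sample = <<'END';
LOCUS       YP_001648463
VERSION     YP_001648463.1  GI:000000000
DBSOURCE    REFSEQ: accession NC_000000.1
END

# Read the sample through an in-memory filehandle.
open my $fh, '<', \$sample or die "cannot open in-memory file: $!";
my ( $protname, $rna ) = extract_fields($fh);
close $fh;

# Report invalid input rather than printing half a result.
warn "no VERSION line found\n"  unless defined $protname;
warn "no DBSOURCE line found\n" unless defined $rna;
print "protein=$protname rna=$rna\n"
    if defined $protname && defined $rna;
```

Whether a missing line should be a warning, a die, or a skipped record depends on your data, so treat the warn calls as placeholders for whatever policy you settle on.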
