
Re: regex with variable input

by spq (Friar)
on Aug 10, 2004 at 01:23 UTC (#381425)


in reply to regex with variable input

First of all, does bork.embl-heidelberg.de offer any sort of batch download, or a text-based save that could be used?

I'm guessing you're looking to do this in a (semi-)automated manner, so saving the page as text from your web browser won't help in the long run, since you'll be scripting the fetch. You may also want to consider contacting the web site maintainers. They may not be thrilled about getting hammered by your script, and may be very open to helping find an easier way to provide this information to you instead.

That being said, I used wget to download the page and wrote this code, which seems to capture what you want (this is a quick translation from a command-line test, and should in no way be considered production-ready or tested).

    while (<>) {
        chomp;
        # if candidate line
        if (/CANDIDATE (\d+)/) {
            $cand = $1;
            @acc  = ();
        }
        # else if ncbi accession #
        elsif (m{http://www.ncbi.nlm.nih.gov:80/entrez/query.fcgi} && m{(\w+)</A>}) {
            push(@acc, $1);
        }
        # else if R-score, print this candidate's info
        elsif (m{R-score</A>\s*=\s*(\S+);}) {
            print "$cand\t", join(",", @acc), "\t$1\n";
        }
    } # while reading web page
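If you want to run this under strict and warnings, or try it without hitting the server at all, here is a rough sketch of the same logic wrapped in a sub. Note that parse_candidates() is a name I made up for illustration, and the sample lines are invented: I am only assuming the real page has a "CANDIDATE n" line, NCBI links ending in `</A>`, and an R-score link followed by `= value;`. I also quote the URL so the dots match literally, which the quick version does not do.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# parse_candidates() is a hypothetical helper, not part of the original post.
# It takes the page as a list of lines and returns one tab-separated row
# ("candidate\taccessions\tR-score") per candidate that has an R-score line.
sub parse_candidates {
    my @lines = @_;

    # Quote the URL so the dots match literally instead of any character.
    my $ncbi = quotemeta 'http://www.ncbi.nlm.nih.gov:80/entrez/query.fcgi';

    my ($cand, @acc, @rows);
    for my $line (@lines) {
        if ($line =~ /CANDIDATE (\d+)/) {
            # New candidate: remember its number, forget the old accessions.
            $cand = $1;
            @acc  = ();
        }
        elsif ($line =~ /$ncbi/ && $line =~ m{(\w+)</A>}) {
            # NCBI link: the link text just before </A> is the accession.
            push @acc, $1;
        }
        elsif (defined $cand && $line =~ m{R-score</A>\s*=\s*(\S+);}) {
            # R-score closes out the candidate, so emit its row.
            push @rows, join "\t", $cand, join(',', @acc), $1;
        }
    }
    return @rows;
}

# Invented sample lines; the real page markup may well differ.
my @sample = (
    '<B>CANDIDATE 1</B>',
    '<A HREF="http://www.ncbi.nlm.nih.gov:80/entrez/query.fcgi?val=AAA11111">AAA11111</A>',
    '<A HREF="http://www.ncbi.nlm.nih.gov:80/entrez/query.fcgi?val=BBB22222">BBB22222</A>',
    '<A HREF="help.html">R-score</A> = 0.87; more text',
);

print "$_\n" for parse_candidates(@sample);
```

Returning the rows instead of printing inside the loop makes it easy to check the regexes against a saved copy of the page before you point a script at the live site.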

