PerlMonks  

Re: scraping from HTTP page to MySql table

by chanio (Priest)
on May 03, 2004 at 18:21 UTC ( [id://350105]=note )


in reply to scraping from HTTP page to MySql table

To know when to re-check the site for changes, you would do better to ask its webmaster at what hours she updates the site. You could even suggest that she publish the changes as a newsfeed (SourceForge offers this) and list it on a site like Syndic8 *.

Then, to be notified of those changes, you fetch an XML file (in RSS or RDF format) that specifies which articles have changed, or that simply tells you to re-check the site.

There are also Perl modules that extract the RSS info from those files and even download them at a specified frequency:

see RSS at CPAN **.

(*) http://www.syndic8.com/

(**) http://search.cpan.org/search?mode=dist&query=RSS
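
To make the feed idea concrete, here is a minimal sketch of "which articles have changed". It is written in Python with only the standard library so it runs as-is; the CPAN RSS modules mentioned above would do the same job in Perl. The feed text, the example.com URLs, and the helper name new_items are all made up for illustration, and a real script would download the feed instead of using an inline string.

```python
import xml.etree.ElementTree as ET

# Hypothetical RSS 2.0 feed. In practice this text would be fetched
# from the feed URL that the webmaster publishes.
SAMPLE_FEED = """<rss version="2.0">
  <channel>
    <title>Example site changes</title>
    <item>
      <title>Timetable updated</title>
      <link>http://example.com/timetable</link>
      <guid>http://example.com/timetable#2004-05-03</guid>
    </item>
    <item>
      <title>New appointment list</title>
      <link>http://example.com/appointments</link>
      <guid>http://example.com/appointments#2004-05-03</guid>
    </item>
  </channel>
</rss>"""


def new_items(feed_xml, seen_guids):
    """Return (title, link) for each feed item not seen before.

    seen_guids is a set that is updated in place, so calling the
    function again with the same feed reports nothing new.
    """
    root = ET.fromstring(feed_xml)
    fresh = []
    for item in root.iter("item"):
        # Fall back to the link when an item carries no guid.
        guid = item.findtext("guid") or item.findtext("link")
        if guid not in seen_guids:
            fresh.append((item.findtext("title"), item.findtext("link")))
            seen_guids.add(guid)
    return fresh


# Pretend the first item was already processed on an earlier run.
seen = {"http://example.com/timetable#2004-05-03"}
changed = new_items(SAMPLE_FEED, seen)
print(changed)
```

Only the links returned here would need to be re-scraped and written to the MySQL table; the set of seen guids would have to be saved between runs.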

{\('v')/}
_`(___)' __________________________

Re: ask the site's webmaster
by Agyeya (Hermit) on May 04, 2004 at 04:49 UTC
    Hi

    The site that I wish to monitor is a dynamic site. It may have details that are subject to random change, e.g. the seat status on a train or bus, or the appointment list of a doctor. On the site, the list will be in the form of an Excel table with the fields Patient ID, Appointment type, Appointment date, and Appointment time.

    Now suppose that a patient wants an appointment. Instead of putting him at the end of the queue, we can check the appointment list for any random cancellations and put the patient in that slot. (This is just an example, as obviously the next patient in the queue should be advanced.) But since people have divided their own time into slots, the free time of the patient should match that of the vacancy in the appointment list.
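
The matching described above can be sketched roughly as follows, assuming the appointment table has already been scraped twice into lists of rows. This is only an illustration in Python: the row keys, the sample data, and the helper names freed_slots and first_match are invented, and dates are assumed to be ISO strings so that sorting them is chronological.

```python
def freed_slots(previous, current):
    """Return (date, time) slots booked in the previous scrape but not the current one."""
    booked_before = {(row["date"], row["time"]) for row in previous}
    booked_now = {(row["date"], row["time"]) for row in current}
    return sorted(booked_before - booked_now)


def first_match(freed, patient_free):
    """Return the earliest freed slot the patient can attend, or None."""
    for slot in freed:
        if slot in patient_free:
            return slot
    return None


# Hypothetical snapshots of the table (Patient ID, Appointment type, date, time).
previous = [
    {"patient_id": "P1", "type": "checkup", "date": "2004-05-05", "time": "09:00"},
    {"patient_id": "P2", "type": "follow-up", "date": "2004-05-05", "time": "10:00"},
    {"patient_id": "P3", "type": "checkup", "date": "2004-05-06", "time": "09:00"},
]
# P2 has cancelled by the time of the second scrape.
current = [previous[0], previous[2]]

# Slots during which the waiting patient is free (also hypothetical).
patient_free = {("2004-05-05", "10:00"), ("2004-05-07", "14:00")}

freed = freed_slots(previous, current)
match = first_match(freed, patient_free)
print(freed, match)
```

Comparing two snapshots is only one way to spot a cancellation. If the page marks cancelled rows explicitly, reading that marker would be simpler.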
