PerlMonks  

Re: Re: Re: questions concerning using perl to monitor webpages

by TVSET (Chaplain)
on May 22, 2003 at 01:17 UTC ( #259959=note )


in reply to Re: Re: questions concerning using perl to monitor webpages
in thread questions concerning using perl to monitor webpages

The first is assuming the new page you fetch will not be served up from cache some place.

That's not a problem with my solution. :) It only needs access to two copies of the site, taken at different times, to compare them. :) Verifying that the content did not come from a cache is either the user's headache or yet another add-on to the script. :)

It would also be a problem if the overall content of the page was the same, but something like the <date> was different every day. Of course, this can be argued both ways, but one must assume that changed is subjective and not objective.

Well, that was one of the reasons I suggested Text::Diff from the very beginning, since it minimizes the headache. You'll be able to quickly grep away things like dates. :)
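To illustrate the "grep away the dates" idea, here is a rough sketch that filters date-looking lines out of both copies before handing them to Text::Diff. The strip_volatile helper, the ISO-style date pattern, and the sample page text are all made up for the example; a real page would need its own filter.

```perl
use strict;
use warnings;
use Text::Diff;

# Hypothetical helper: split a page into lines and drop the ones that
# look like they carry a date (assumed here to be YYYY-MM-DD).
sub strip_volatile {
    my ($text) = @_;
    return [ grep { !/\b\d{4}-\d{2}-\d{2}\b/ } split /^/m, $text ];
}

# Stand-ins for two copies of the same page fetched at different times.
my $old = "Welcome\nUpdated: 2003-05-21\nNews: nothing\n";
my $new = "Welcome\nUpdated: 2003-05-22\nNews: nothing\n";

# diff() accepts array references of lines and returns the diff as a string.
my $diff = diff( strip_volatile($old), strip_volatile($new) );

print $diff eq '' ? "no interesting change\n" : "changed:\n$diff";
```

You could just as well diff first and grep the output afterwards; filtering first simply keeps the date noise out of the diff entirely.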

I would probably roll my own very much like you have suggested. Since the number of pages to track could get large, I would probably store the MD5 sum and the URL in a database and that's it.

You could always start off with a hash like:

my %internet = ( 'url' => 'md5 checksum', );
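Filled in a bit, that hash could look something like the sketch below. It uses Digest::MD5 for the checksum; the has_changed helper and the example URL are just names picked for illustration, and fetching the page is left out, so the content is passed in as a plain string.

```perl
use strict;
use warnings;
use Digest::MD5 qw(md5_hex);

my %internet;    # url => md5 checksum of the last copy seen

# Hypothetical helper: record the checksum for $url and report whether
# it differs from the previous one. A URL seen for the first time
# counts as changed.
sub has_changed {
    my ( $url, $content ) = @_;
    my $sum     = md5_hex($content);
    my $changed = !defined( $internet{$url} ) || $internet{$url} ne $sum;
    $internet{$url} = $sum;
    return $changed;
}

# Placeholder content standing in for two fetches of the same page.
for my $content ( 'first copy', 'first copy', 'second copy' ) {
    my $state = has_changed( 'http://example.com/', $content ) ? 'changed' : 'same';
    print "$state\n";
}
```

Once the number of pages grows, the same url/checksum pairs can move to a tied hash or a database table without changing the logic.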

Thanks for the feedback anyway. :)

Leonid Mamtchenkov aka TVSET

