
Re: Re: Re: questions concerning using perl to monitor webpages

by TVSET (Chaplain)
on May 22, 2003 at 01:17 UTC (#259959)

in reply to Re: Re: questions concerning using perl to monitor webpages
in thread questions concerning using perl to monitor webpages

The first is assuming the new page you fetch will not be served up from cache some place.

That's not a problem with my solution. :) It only needs access to two copies of the page, taken at different times, to compare them. Validating that the content was not served from a cache somewhere is either the user's headache or yet another add-on to the script. :)

It would also be a problem if the overall content of the page was the same, but something like the <date> was different every day. Of course, this can be argued both ways, but one must assume that changed is subjective and not objective.

Well, that was one of the reasons I suggested the use of Text::Diff from the very beginning, since it will minimize the headache. You'll be able to quickly grep away things like dates. :)
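To make the "grep away" idea concrete, here is a rough sketch, not a finished tool. The volatile-line pattern, the sub name stable_lines and the sample pages are all made up for illustration. Text::Diff is a CPAN module rather than core, so the sketch loads it at run time and just warns if it is missing.

```perl
use strict;
use warnings;

# Hypothetical pattern for "volatile" lines; adjust it to the page being watched.
my $volatile = qr/<date>|\b\d{4}-\d{2}-\d{2}\b/;

# Split a page into lines (keeping the newlines) and grep away the volatile ones.
sub stable_lines {
    my ($text) = @_;
    return grep { !/$volatile/ } split /^/m, $text;
}

# Made-up sample pages: the date changes, and so does one real line.
my $old = "<h1>News</h1>\n<date>2003-05-21</date>\n<p>Hello</p>\n";
my $new = "<h1>News</h1>\n<date>2003-05-22</date>\n<p>Hello, world</p>\n";

my @old = stable_lines($old);
my @new = stable_lines($new);

# Text::Diff accepts array references of records as its two inputs.
if ( eval { require Text::Diff; 1 } ) {
    print Text::Diff::diff( \@old, \@new, { STYLE => 'Unified' } );
}
else {
    warn "Text::Diff is not installed; skipping the diff output\n";
}
```

With the date line filtered out on both sides, only the paragraph that really changed should show up in the diff.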

I would probably roll my own very much like you have suggested. Since the number of pages to track could get large, I would probably store the MD5 sum and the URL in a database and that's it.

You could always start with a hash like:

my %internet = ( 'url' => 'md5 checksum', );
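A slightly fuller sketch of that hash, assuming the core Digest::MD5 module for the checksums. The sub name has_changed and the example URL are invented here, and fetching the page is left out on purpose (the strings stand in for fetched content).

```perl
use strict;
use warnings;
use Digest::MD5 qw(md5_hex);

# url => MD5 checksum of the content last seen for that URL.
my %internet;

# Returns true if $content differs from what was last recorded for $url.
# The first sighting of a URL is recorded but not reported as a change.
sub has_changed {
    my ( $url, $content ) = @_;
    my $sum = md5_hex($content);
    my $old = $internet{$url};
    $internet{$url} = $sum;
    return defined($old) && $old ne $sum;
}

# Placeholder URL and content; a real script would fetch the page first.
my $url = 'http://example.com/';
print has_changed( $url, "first version\n" )  ? "changed\n" : "same\n";   # same
print has_changed( $url, "first version\n" )  ? "changed\n" : "same\n";   # same
print has_changed( $url, "second version\n" ) ? "changed\n" : "same\n";   # changed
```

Once the list of pages gets large, the same url/checksum pairs could move into a database table without changing the comparison logic.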

Thanks for the feedback anyway. :)

Leonid Mamtchenkov aka TVSET
