Malach
Note: I'm not saying that you're wrong. At all.

I'd be inclined to take a different tack on this.

Write a script that checks the last-modified date/time on the external links and, for any that have changed since the last check, hands the list to a human to review.

Of course, there are issues with getting the last-modified time accurately, but I imagine that they're more solvable than parsing for content.
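
For what it's worth, here is a rough sketch of that idea in Python, not a finished tool. It assumes the servers send a `Last-Modified` header at all (many dynamic pages don't), and the names `fetch_last_modified`, `changed_links`, and the `last_seen` mapping are made up for illustration. Anything that can't be fetched or has no usable header goes on the review list rather than being silently skipped.

```python
from email.utils import parsedate_to_datetime
from urllib.request import Request, urlopen


def fetch_last_modified(url, timeout=10):
    """Return the Last-Modified header as a datetime, or None if absent/unparseable."""
    req = Request(url, method="HEAD")  # HEAD: headers only, no body
    with urlopen(req, timeout=timeout) as resp:
        header = resp.headers.get("Last-Modified")
    if not header:
        return None
    try:
        return parsedate_to_datetime(header)
    except (TypeError, ValueError):
        return None


def changed_links(last_seen, fetch=fetch_last_modified):
    """Return the URLs a human should review.

    last_seen maps URL -> datetime recorded at the previous check (or None).
    `fetch` is injectable so the logic can be tried without network access.
    """
    to_review = []
    for url, previous in last_seen.items():
        try:
            current = fetch(url)
        except OSError:
            # Unreachable or HTTP error: a dead link also needs a human.
            to_review.append(url)
            continue
        # No header now, or no record from before: we can't tell, so flag it.
        if current is None or previous is None or current > previous:
            to_review.append(url)
    return to_review
```

You'd store the timestamps between runs however suits you (a flat file would do) and mail the returned list to whoever does the checking.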

Perhaps each page has a certain string you can check for to make sure it's unchanged?

Hope the different viewpoint helps.



    That's another good idea, but most (if not all) of our external links will be updated quite often. We just don't have the manpower to check every link every time it is updated.

    Besides... why have people do the work that a few well-planned regexps can do? :) Thanks, though!
      Well... you could have your script check only the pages that report a change and, as a safety net, check the pages that haven't reported changes on a less frequent basis, so that your script doesn't run forever.
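
A minimal sketch of that two-tier schedule, in Python. The record layout (`url`, `reported_change`, `last_checked`) and the 30-day fallback interval are assumptions for the example, not anything from the actual setup.

```python
from datetime import datetime, timedelta


def due_for_check(pages, now, fallback=timedelta(days=30)):
    """Pick the pages the content checker should process on this run.

    pages: list of dicts with 'url', 'reported_change' (bool) and
    'last_checked' (datetime of the last full content check).
    """
    due = []
    for page in pages:
        if page["reported_change"]:
            # Tier 1: the page says it changed, so check it now.
            due.append(page["url"])
        elif now - page["last_checked"] >= fallback:
            # Tier 2 (safety net): no change reported, but it has been a while.
            due.append(page["url"])
    return due
```

Everything else is skipped on this run, which is what keeps the total running time bounded.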

