PerlMonks  

Re^3: Normalizing URLs

by spiritway (Vicar)
on Jul 22, 2005 at 03:18 UTC (#477087)


in reply to Re^2: Normalizing URLs
in thread Normalizing URLs

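What about using the "last_modified" method in LWP? It is available on the HTTP::Response object and reports the page's Last-Modified header. Keep track of that value locally for each URL. When you access the page again, check the time it was modified and skip the page if that time is not newer than what you've saved.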

This idea is from "Spidering Hacks" (hack #16).
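
Here is a minimal sketch of that check, not the book's code. The %seen hash and the needs_visit, remember, and check_url subs are names made up for illustration; a real spider would persist the saved times between runs. It assumes the server sends a Last-Modified header, and it treats a missing header as "visit anyway".

```perl
use strict;
use warnings;

# Hypothetical in-memory store: URL => Last-Modified time (epoch seconds).
# A real spider would persist this between runs (file, DBM, database).
my %seen;

# True if the page should be (re)processed.
# $remote is the Last-Modified time in epoch seconds, or undef when the
# server did not send the header.
sub needs_visit {
    my ($url, $remote) = @_;
    return 1 unless defined $remote;       # no header: cannot tell, so visit
    return 1 unless exists $seen{$url};    # never seen before
    return $remote > $seen{$url} ? 1 : 0;  # visit only if strictly newer
}

# Record the modification time we last saw for a URL.
sub remember {
    my ($url, $remote) = @_;
    $seen{$url} = $remote if defined $remote;
    return;
}

# Ask the server for headers only, then decide whether to fetch the page.
sub check_url {
    my ($url) = @_;
    # Loaded here so the comparison helpers above work without LWP installed.
    require LWP::UserAgent;
    my $ua  = LWP::UserAgent->new(timeout => 10);
    my $res = $ua->head($url);
    return 0 unless $res->is_success;      # skip pages we cannot reach

    my $remote = $res->last_modified;      # epoch seconds, or undef
    my $visit  = needs_visit($url, $remote);
    remember($url, $remote);
    return $visit;
}
```

Calling check_url($url) before each fetch would then return true only for pages that are new or have changed since the last run. The HEAD request keeps the check cheap, though some servers answer HEAD differently from GET or leave out Last-Modified for dynamic pages, so treat the result as a hint.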
