PerlMonks  

Re: Re: Checking for an existing URL

by rjimlad (Acolyte)
on Sep 29, 2002 at 14:57 UTC


in reply to Re: Checking for an existing URL
in thread Checking for an existing URL

Or if you are just after 404s:

... if ($response->code()==404) { ...

...which is valid because, while the payload of a 404 (or in fact the text message that follows the code) could be anything (eg it could be localised), the status line of the response must carry the three-digit HTTP response code.
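To illustrate the point, here is a minimal sketch that pulls the code out of a raw status line. The helper name status_code_from_line and the two sample lines are made up for this example; in practice $response->code() already does this for you.

```perl
use strict;
use warnings;

# Hypothetical helper: extract the three-digit code from a raw status
# line such as "HTTP/1.1 404 Not Found". Returns undef if the line does
# not look like an HTTP status line.
sub status_code_from_line {
    my ($line) = @_;
    return $line =~ m{^HTTP/\d+\.\d+\s+(\d{3})\b} ? $1 : undef;
}

# The reason phrase differs between these two (invented) lines,
# but the numeric code is the same.
for my $line ('HTTP/1.1 404 Not Found', 'HTTP/1.0 404 Introuvable') {
    my $code = status_code_from_line($line);
    print "$line => $code\n";
}
```

So comparing against the number is safe, whereas matching on the text is not.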

Unfortunately, there is a flaw in any such approach. If you want to probe for the existence of a file (or listening script), the request can fail for reasons that say nothing about whether the target exists: DNS problems, a bad (or unusable) URI scheme part, a faulty proxy or redirector, problems connecting to the IP, or random server problems. None of these gives you a 404. On top of that, a CGI script may send a 404 response on purpose (necessary for most properly-operating error handler scripts), so even a 404 does not prove that nothing is there.

The other side of it is that you can get false positives, ie a success status for a URL that doesn't really exist, from eg Apache handlers, ErrorDocuments and the like, including badly-operating error handlers.

In short, there's no easy way to do so. Best option, IMO, would be to use $response->is_success(), as mentioned (implicitly) in the message preceding this, to mark 'valid' URLs and consider anything else to be undefined.
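As a rough sketch of that suggestion (the helper name classify_response is invented here, it is not part of LWP), something like this would do. It only relies on the is_success and status_line methods of HTTP::Response:

```perl
use strict;
use warnings;

# Hypothetical helper, not part of LWP: takes an HTTP::Response (or any
# object offering the same is_success/status_line methods) and returns
# a verdict of 'valid' or 'undefined', plus the status line for logging.
sub classify_response {
    my ($response) = @_;

    # Any 2xx status: the server answered and served something.
    return ('valid', $response->status_line) if $response->is_success;

    # Everything else (404, 500, DNS failure, refused connection, ...)
    # proves nothing either way, so it is deliberately not called 'missing'.
    return ('undefined', $response->status_line);
}
```

With LWP it might be called as my ($verdict, $why) = classify_response(LWP::UserAgent->new->head($url)); where $url is assumed to hold the address you want to probe. Bear in mind the caveat above: 'valid' can still be a false positive from an error handler that answers with a success code.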

