This is a lot easier than you think ...
use LWP::UserAgent;
use HTTP::Request;

my $ua = LWP::UserAgent->new;
my $request = HTTP::Request->new('GET' => $url);
my $response = $ua->request($request);
if ($response->is_error) {
    # the request failed (4xx or 5xx status)
    ...
} else {
    ...
}
Alternatively, you could employ the is_success method for testing for successful retrieval of the passed URL. Furthermore, the actual numeric response code received can be returned with the code method. For further details on HTTP::Response methods, see the HTTP::Response man page.
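To make the three methods concrete, here is a minimal sketch that reports what is_success, is_error and code say about a response. The helper name describe_response is made up for illustration, and the responses are built by hand (rather than fetched with $ua->request) so the snippet runs without a network connection.

```perl
use strict;
use warnings;
use HTTP::Response;

# Summarise a response using only HTTP::Response methods.
# describe_response is a hypothetical helper name, not part of LWP.
sub describe_response {
    my ($response) = @_;
    my $kind = $response->is_success ? 'success'
             : $response->is_error   ? 'error'
             :                         'other';
    return sprintf '%d (%s)', $response->code, $kind;
}

# Hand-built responses stand in for $ua->request($request) here.
for my $response (HTTP::Response->new(200, 'OK'),
                  HTTP::Response->new(404, 'Not Found')) {
    print describe_response($response), "\n";
}
```

In real use you would pass in the $response returned by $ua->request($request).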
...
if ($response->code()==404) {
...
...which is valid because, while the payload of a 404 (or indeed the text message in the status header) could be anything (e.g. it could be localised), the first three non-space characters in the Status: HTTP header must be the HTTP response code.
Unfortunately, there is a flaw in any such approach. If you want to probe for the existence of a file (or listening script), you may for example have DNS problems, a bad (or unusable) URI scheme part, a faulty proxy or redirector, problems connecting to the IP, random server problems, not to mention the possibility of a CGI script that sends a 404 response on purpose (necessary for most properly-operating error handler scripts).
The other side of it is that you can get 'false' positives from, e.g., Apache 'handlers', ErrorDocuments and the like, including badly-operating error handlers.
In short, there's no easy way to do so. Best option, IMO, would be to use $response->is_success(), as mentioned (implicitly) in the message preceding this, to mark 'valid' URLs and consider anything else to be undefined.
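A rough sketch of that "valid or undefined" rule might look like the following. Both helper names (url_is_valid, came_from_client) are invented for this example. The second one relies on LWP marking the responses it generates itself (for instance when the connection could not be made) with a "Client-Warning: Internal response" header; the responses are built by hand so the snippet runs offline.

```perl
use strict;
use warnings;
use HTTP::Response;

# Hypothetical helper: 1 for a successful (2xx) response, undef for
# everything else, meaning "unknown" rather than "does not exist".
sub url_is_valid {
    my ($response) = @_;
    return $response->is_success ? 1 : undef;
}

# Hypothetical helper: true when LWP produced the response itself
# instead of receiving it from a server.
sub came_from_client {
    my ($response) = @_;
    return ($response->header('Client-Warning') // '') eq 'Internal response';
}

# Hand-built responses stand in for real requests.
my $ok      = HTTP::Response->new(200, 'OK');
my $missing = HTTP::Response->new(404, 'Not Found');
my $no_conn = HTTP::Response->new(500, "Can't connect");
$no_conn->header('Client-Warning' => 'Internal response');

for my $r ($ok, $missing, $no_conn) {
    printf "%s: %s%s\n", $r->status_line,
        url_is_valid($r) ? 'valid' : 'undetermined',
        came_from_client($r) ? ' (never reached the server)' : '';
}
```

Note that the 404 is deliberately reported as 'undetermined' rather than 'missing', for the reasons given above.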
I think you should try HEAD instead of GET. If all you want to know is whether the URL is fine, why would you download the whole document? :-)
Another thing. $response->code() returns the HTTP status code. So you do not have to search for anything in the contents. Besides ... it's quite possible that the HTML with the "URL not found" message will NOT contain any 404 at all.
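Something along these lines, as a minimal sketch: the helper name head_status is made up, and it simply wraps the head method of LWP::UserAgent and hands back the numeric code.

```perl
use strict;
use warnings;
use LWP::UserAgent;

# Hypothetical helper: send a HEAD request and return the numeric
# HTTP status code. The server sends headers only, no document body.
sub head_status {
    my ($ua, $url) = @_;
    my $response = $ua->head($url);
    return $response->code;
}
```

You would call it as head_status(LWP::UserAgent->new, $url) and compare the result with 404 or whatever code you care about.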
Jenda
It is very rare, but merlyn speaks the truth: some web servers are broken.
There is still no reason to download the whole document.
Enjoy the fruits of those who RTFM :) LWP head replacement
poetry ;)
update:
I don't mean it's rare that merlyn speaks the truth, I mean it is rare that a webserver is broken in such a manner, where a HEAD request would fail like so ;)
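For what it's worth, one way to cope with a server that mishandles HEAD without pulling down the whole page is sketched below. This is not the code from the linked node, just an illustration: probe_status is an invented name and the 1024-byte cap is an arbitrary choice. It tries HEAD first and, if that is not successful, repeats the request as a GET with max_size set so LWP stops reading the body early.

```perl
use strict;
use warnings;
use LWP::UserAgent;

# Hypothetical helper: return the status code for $url, preferring
# HEAD and falling back to a size-limited GET.
sub probe_status {
    my ($ua, $url) = @_;

    my $response = $ua->head($url);
    return $response->code if $response->is_success;

    # HEAD was not successful; the server may simply not support it
    # properly, so ask again with GET. max_size caps how much of the
    # body LWP reads (1024 bytes is an arbitrary assumption).
    my $previous_limit = $ua->max_size;
    $ua->max_size(1024);
    $response = $ua->get($url);
    $ua->max_size($previous_limit);    # restore the caller's setting

    return $response->code;
}
```

The cost is a second request for every URL that really is missing, which seems a fair trade for not trusting a broken HEAD.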