
Re: Re: Checking for an existing URL

by rjimlad (Acolyte)
on Sep 29, 2002 at 14:57 UTC ( #201550=note )

in reply to Re: Checking for an existing URL
in thread Checking for an existing URL

Or if you are just after 404s:

... if ($response->code()==404) { ...

...which is valid because, while the payload of a 404 (or indeed the text message that follows the code) could be anything (e.g. it could be localised), the three-digit response code itself is fixed: it is the first token after the protocol version in the HTTP status line, and the first three non-space characters of a CGI Status: header.
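To illustrate the point about the code being in a fixed position whatever the message says, here is a minimal sketch in plain Perl. The helper name status_code_from_line and the sample status lines are made up for illustration; in practice LWP does this parsing for you and hands back the number via $response->code().

```perl
use strict;
use warnings;

# Hypothetical helper (not part of LWP): extract the three-digit status
# code from a raw HTTP status line, ignoring whatever text follows it.
sub status_code_from_line {
    my ($line) = @_;
    # Protocol version, whitespace, exactly three digits, then a space or end of line.
    return $line =~ m{^HTTP/\d+(?:\.\d+)?\s+(\d{3})(?:\s|$)} ? $1 : undef;
}

# The message differs (one is localised), the code does not.
print status_code_from_line('HTTP/1.1 404 Not Found'),   "\n";
print status_code_from_line('HTTP/1.1 404 Introuvable'), "\n";
```

Both lines yield the same code, which is why comparing the number is safer than matching on the message text.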

Unfortunately, there is a flaw in any such approach. If you want to probe for the existence of a file (or listening script), the request can fail for reasons that say nothing about the file itself: DNS problems, a bad (or unusable) URI scheme part, a faulty proxy or redirector, problems connecting to the IP, or random server problems, not to mention the possibility of a CGI script that sends a 404 response on purpose (necessary for most properly operating error handler scripts).

And the other side of it is that you can get 'false' positives from, e.g., Apache handlers, ErrorDocuments and the like, including badly operating error handlers.

In short, there's no easy way to tell for certain whether a URL exists. Best option, IMO, would be to use $response->is_success(), as mentioned (implicitly) in the message preceding this, to mark 'valid' URLs and consider anything else to be undefined.
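A minimal sketch of that approach, assuming LWP::UserAgent is installed. The names verdict_for and probe_url, the 10-second timeout, and the choice of a HEAD request are my own assumptions, not something from the thread; some servers handle HEAD badly, in which case get() may be the better call.

```perl
use strict;
use warnings;

# Hypothetical helper: 'valid' only when the response reports success
# (a 2xx code); everything else is treated as undefined, not as "missing".
sub verdict_for {
    my ($response) = @_;
    return $response->is_success ? 'valid' : 'undefined';
}

# Hypothetical helper: probe one URL and return the verdict.
sub probe_url {
    my ($url) = @_;
    require LWP::UserAgent;                        # loaded only when a probe is made
    my $ua = LWP::UserAgent->new(timeout => 10);   # timeout value is an assumption
    my $response = $ua->head($url);                # HEAD avoids fetching the body
    return verdict_for($response);
}
```

Calling probe_url('http://www.example.com/') would then give 'valid' or 'undefined'. DNS failures, refused connections and the like also come back from LWP as unsuccessful responses, so they land in 'undefined' rather than being reported as a missing page.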
