PerlMonks  

Re: Checking "incomplete" URLs

by BlueLines (Hermit)
on Feb 18, 2002 at 21:53 UTC


in reply to Checking "incomplete" URLs

My question is how do I get LWP useragent to act like a browser and find the default page in a directory?

It has nothing to do with your browser, and everything to do with your web server. I tested your example on a site I control (running Apache). Here's what happened:
    [jon@valium jon]$ telnet divisionbyzero.com 80
    Trying 168.103.109.84...
    Connected to divisionbyzero.com.
    Escape character is '^]'.
    GET /decss HTTP/1.0

    HTTP/1.1 301 Moved Permanently
    Date: Tue, 19 Feb 2002 02:47:50 GMT
    Server: Apache/1.3.22 (Unix) (Red-Hat/Linux) mod_ssl/2.8.5 OpenSSL/0.9.6b mod_perl/1.24_01
    Location: http://www.divisionbyzero.com/decss/
    Connection: close
    Content-Type: text/html; charset=iso-8859-1

    <!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">
    <HTML><HEAD>
    <TITLE>301 Moved Permanently</TITLE>
    </HEAD><BODY>
    <H1>Moved Permanently</H1>
    The document has moved <A HREF="http://www.divisionbyzero.com/decss/">here</A>.<P>
    <HR>
    <ADDRESS>Apache/1.3.22 Server at www.divisionbyzero.com Port 80</ADDRESS>
    </BODY></HTML>
    Connection closed by foreign host.
The web server sent me a 301 because /decss isn't an actual file but a directory, and the Location header points at the same path with a trailing slash. My web browser followed that redirect automatically, which is what browsers are supposed to do when the HTTP method is GET or HEAD. I suspect your trouble comes from using the POST method: a client is explicitly forbidden to follow a redirect of a POST automatically, without confirmation from the user.
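To see from Perl what the telnet session shows by hand, here is a minimal sketch, assuming a reasonably recent libwww-perl. The helper names redirect_hops and fetch_with_hops are made up for this example. A plain GET through LWP::UserAgent follows the 301 on its own, and the responses it passed through stay reachable from the final response.

```perl
use strict;
use warnings;
use LWP::UserAgent;
use HTTP::Response;

# Hypothetical helper: describe every redirect that led to $res,
# oldest first, as "CODE -> Location" strings.
sub redirect_hops {
    my ($res) = @_;
    return map {
        $_->code . ' -> ' . ($_->header('Location') // '(no Location header)')
    } $res->redirects;
}

# Hypothetical helper: GET a URL and return the final response together
# with the redirect hops. GET requests are redirected automatically.
sub fetch_with_hops {
    my ($url) = @_;
    my $ua  = LWP::UserAgent->new(timeout => 10);
    my $res = $ua->get($url);
    return ($res, redirect_hops($res));
}
```

Calling fetch_with_hops('http://divisionbyzero.com/decss') against a server that behaves like the one above should give back the final page plus one hop for the 301. The same request sent as a POST would stop at the 301 instead.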

BlueLines

Disclaimer: This post may contain inaccurate information, be habit forming, cause atomic warfare between peaceful countries, speed up male pattern baldness, interfere with your cable reception, exile you from certain third world countries, ruin your marriage, and generally spoil your day. No batteries included, no strings attached, your mileage may vary.

Replies are listed 'Best First'.
Re: Re: Checking "incomplete" URLs
by nop (Hermit) on Feb 18, 2002 at 22:08 UTC
    Hurrah! GET (vs. POST) solved it -- Many thanks, BlueLines! ++
        sub validURL {
            my ($self, $url) = @_;
            my $req = HTTP::Request->new(GET => $url);
            my $res = $self->request($req);
            my $content = $res->content;
            # The site's "not found" page counts as an invalid URL.
            return 0 if $content =~ /the page you have requested cannot be found/i;
            # So does an empty or whitespace-only body.
            return 0 unless $content =~ /\S/;
            return 1;
        }
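A possible variation on the check above, offered as a sketch rather than a correction: look at the HTTP status before matching on the page text, so that a real 404 or 500 is rejected even when the error page is worded differently. The function name response_looks_valid is hypothetical, and the "cannot be found" pattern is simply carried over from the code above.

```perl
use strict;
use warnings;
use HTTP::Response;

# Hypothetical helper: decide whether an HTTP::Response looks like a real page.
sub response_looks_valid {
    my ($res) = @_;

    # Reject anything that did not end in a 2xx status.
    return 0 unless $res->is_success;

    my $content = $res->content;

    # Reject empty or whitespace-only bodies.
    return 0 unless defined $content && $content =~ /\S/;

    # Reject the "soft 404" page that still comes back with a 200 status.
    return 0 if $content =~ /the page you have requested cannot be found/i;

    return 1;
}
```

Inside validURL this would be called as response_looks_valid($self->request($req)).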
Re: Re: Checking "incomplete" URLs
by chipmunk (Parson) on Feb 18, 2002 at 23:29 UTC
    By default, LWP::UserAgent automatically follows redirects for any request except a POST. The redirect_ok() method controls this behavior:
    $ua->redirect_ok
        This method is called by request() before it tries to do any redirects. It should return a true value if a redirect is allowed to be performed. Subclasses might want to override this.

        The default implementation will return FALSE for POST request and TRUE for all others.
    Recently I had to write a script which posted a form on a remote site, and then checked the text of the resulting page to make sure the post succeeded. Unfortunately, there was a redirect to that page.

    First I tried making a subclass with a new redirect_ok() that always returned 1. Unfortunately, LWP::UserAgent used a POST request for the redirect, and the remote server returned a 405 error. I ended up writing a redirect_ok() which replaced the POST request object in @_ with a new one that did a GET instead. Ugly, but it worked!
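The workaround described above could look roughly like this. It is a sketch, not the original script: it assumes the redirect_ok($prospective_request, $response) signature of current libwww-perl releases (the 2002 version may have differed), and the package name My::GetOnRedirectUA is invented for the example. Instead of swapping the object in @_, it rewrites the prospective request in place.

```perl
use strict;
use warnings;
use LWP::UserAgent;
use HTTP::Request;
use HTTP::Response;

{
    # Hypothetical subclass name, used only for this example.
    package My::GetOnRedirectUA;
    use parent -norequire, 'LWP::UserAgent';

    # Allow every redirect, but turn anything other than GET/HEAD
    # into a body-less GET before it is sent to the new location.
    sub redirect_ok {
        my ($self, $new_request, $response) = @_;

        # Like the stock implementation, never follow a redirect to file://.
        return 0 if ($new_request->uri->scheme // '') eq 'file';

        my $method = uc $new_request->method;
        unless ($method eq 'GET' || $method eq 'HEAD') {
            $new_request->method('GET');
            $new_request->content('');
            $new_request->remove_content_headers;
        }
        return 1;
    }
}
```

A script would then build its agent with My::GetOnRedirectUA->new and post the form as usual. The form data is not sent again to the redirect target, which is what avoids the 405.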

      You could upgrade to the latest libwww-perl and just use the requests_redirectable method of LWP::UserAgent:
      $ua->requests_redirectable( );            # to read
      $ua->requests_redirectable( \@requests ); # to set
          This reads or sets the object's list of request names that "$ua->redirect_ok(...)" will allow redirection for. By default, this is "['GET', 'HEAD']", as per RFC 2068.

          To change to include 'POST', consider:

              push @{ $ua->requests_redirectable }, 'POST';
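Wrapped into something runnable, that might look like the sketch below. The function name post_following_ua is invented for the example, and the timeout value is arbitrary.

```perl
use strict;
use warnings;
use LWP::UserAgent;

# Hypothetical helper: build a user agent that also follows redirects
# issued in response to POST requests.
sub post_following_ua {
    my $ua = LWP::UserAgent->new(timeout => 10);

    # Add POST to the list of redirectable methods, once.
    push @{ $ua->requests_redirectable }, 'POST'
        unless grep { $_ eq 'POST' } @{ $ua->requests_redirectable };

    return $ua;
}
```

One caveat, as far as I know for recent libwww-perl releases: after a 302 or 303 the follow-up request is turned into a GET, but after a 301 or 307 the POST is repeated as a POST, so a server like the one described above may still answer with a 405.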

      --
      Ilya Martynov (http://martynov.org/)
