Beefy Boxes and Bandwidth Generously Provided by pair Networks
No such thing as a small change
 
PerlMonks  

LWP::UserAgent subclass to make it follow redirects after POST (like Netscape)

by gregorovius (Friar)
on Feb 26, 2002 at 16:09 UTC ( #147608=sourcecode: print w/ replies, xml ) Need Help??

Category: Web Stuff
Author/Contact Info Claudio Garcia (gregorovius) (claudio.garcia@stanfordalumni.org)
Description: A subclass of LWP::UserAgent that replicates Netscape's behavior on redirects after a POST request (ie. it will follow POST redirects but it will turn them into GETs before doing so ). I believe Microsoft's IE behaves like this as well.

A lot of web applications rely on this non-standard behavior in browsers so I think it would be a good idea to integrate this to LWP. See Redirect after POST behavior in LWP::UserAgent differs from Netscape for reference.

Look for the XXX marker in the code to see where this code differs from the one in LWP::UserAgent.

package LWP::NetscapeLikeUserAgent;

use LWP::UserAgent;

use vars qw(@ISA);

@ISA = qw(LWP::UserAgent);


#
# request
#
# Substitutes the default (and standard compliant) "request" method of
+ the
# LWP::UserAgent module for one that behaves like the Netscape browser
+, ie.
# it will follow redirects after POST requests.
#
sub request
{
    my($self, $request, $arg, $size, $previous) = @_;

    LWP::Debug::trace('()');

    my $response = $self->simple_request($request, $arg, $size);

    my $code = $response->code;
    $response->previous($previous) if defined $previous;

    LWP::Debug::debug('Simple response: ' .
              (HTTP::Status::status_message($code) ||
               "Unknown code $code"));

    if ($code == &HTTP::Status::RC_MOVED_PERMANENTLY or
    $code == &HTTP::Status::RC_MOVED_TEMPORARILY) {

    # Make a copy of the request and initialize it with the new URI
    my $referral = $request->clone;

    # XXX This is where this routine differs from the one in LWP::User
+Agent.
    # If the method of the request is POST then we make the method of 
+the
    # new request be GET, so that it passes the redirect_ok test. This
+ is 
    # non-standard but is the way Netscape does it, and many web appli
+cations
    # rely on this.
    if($request->method eq 'POST') {
        $referral->method('GET');
        $referral->content('');
    }

    # And then we update the URL based on the Location:-header.
    my $referral_uri = $response->header('Location');
    {
        # Some servers erroneously return a relative URL for redirects
+,
        # so make it absolute if it not already is.
        local $URI::ABS_ALLOW_RELATIVE_SCHEME = 1;
        my $base = $response->base;
        $referral_uri = $HTTP::URI_CLASS->new($referral_uri, $base)
                    ->abs($base);
    }

    $referral->url($referral_uri);

    return $response unless $self->redirect_ok($referral);

    # Check for loop in the redirects
    my $count = 0;
    my $r = $response;
    while ($r) {
        if (++$count > 13 ||
                $r->request->url->as_string eq $referral_uri->as_strin
+g) {
        $response->header("Client-Warning" =>
                  "Redirect loop detected");
        return $response;
        }
        $r = $r->previous;
    }

    return $self->request($referral, $arg, $size, $response);

    } elsif ($code == &HTTP::Status::RC_UNAUTHORIZED ||
         $code == &HTTP::Status::RC_PROXY_AUTHENTICATION_REQUIRED
        )
    {
    my $proxy = ($code == &HTTP::Status::RC_PROXY_AUTHENTICATION_REQUI
+RED);
    my $ch_header = $proxy ?  "Proxy-Authenticate" : "WWW-Authenticate
+";
    my @challenge = $response->header($ch_header);
    unless (@challenge) {
        $response->header("Client-Warning" => 
                  "Missing Authenticate header");
        return $response;
    }

    require HTTP::Headers::Util;
    CHALLENGE: for my $challenge (@challenge) {
        $challenge =~ tr/,/;/;  # "," is used to separate auth-params!
+!
        ($challenge) = HTTP::Headers::Util::split_header_words($challe
+nge);
        my $scheme = lc(shift(@$challenge));
        shift(@$challenge); # no value
        $challenge = { @$challenge };  # make rest into a hash
        for (keys %$challenge) {       # make sure all keys are lower 
+case
        $challenge->{lc $_} = delete $challenge->{$_};
        }

        unless ($scheme =~ /^([a-z]+(?:-[a-z]+)*)$/) {
        $response->header("Client-Warning" => 
                  "Bad authentication scheme '$scheme'");
        return $response;
        }
        $scheme = $1;  # untainted now
        my $class = "LWP::Authen::\u$scheme";
        $class =~ s/-/_/g;
    
        no strict 'refs';
        unless (%{"$class\::"}) {
        # try to load it
        eval "require $class";
        if ($@) {
            if ($@ =~ /^Can\'t locate/) {
            $response->header("Client-Warning" =>
                      "Unsupported authentication scheme '$scheme'");
            } else {
            $response->header("Client-Warning" => $@);
            }
            next CHALLENGE;
        }
        }
        return $class->authenticate($self, $proxy, $challenge, $respon
+se,
                    $request, $arg, $size);
    }
    return $response;
    }
    return $response;
}

sub redirect_ok
{
    1;
}

1;

Comment on LWP::UserAgent subclass to make it follow redirects after POST (like Netscape)
Download Code
•Re: LWP::UserAgent subclass to make it follow redirects after POST (like Netscape)
by merlyn (Sage) on Feb 26, 2002 at 16:15 UTC
    This is much simpler:
    use LWP::UserAgent; sub LWP::UserAgent::redirect_ok { my ($self, $request) = @_; $request->method("GET"),$request->content("") if $request->method eq + "POST"; 1; } ...
    You can subclass it if you wish, but just "patching" it fixes program-wide behavior.

    -- Randal L. Schwartz, Perl hacker


    updated code based on feedback.
      Doing that is not enough. The POST request must be converted into a GET request by the agent and its content removed. That's Netscape's exact behavior, as far as I remember (I wrote this code about a year ago). This snippet in my code does it:
      if($request->method eq 'POST') { $referral->method('GET'); $referral->content(''); }
      Last time I checked UserAgent was not doing this as a result of overriding redirect_ok().

      Update: This is a reply to Merlyn's original post, in which redirect_ok simply returned a 1.

      The latest LWP::UserAgent allows the behavior of redirect_ok to be overridden using the requests_redirectable method.
      $ua->requests_redirectable( ); # to read $ua->requests_redirectable( \@requests ); # to set This reads or sets the object's list of request names that $ua->redirect_ok(...) will allow redirection for. By default, this is ['GET', 'HEAD'], as per RFC 2068. To change to include 'POST', consider: push @{ $ua->requests_redirectable }, 'POST'; $ua->redirect_ok($prospective_request) This method is called by request() before it tries to follow a redirection to the request in $prospective_request. This should return a true value if this redirection is permissible. The default implementation will return FALSE unless the method is in the object's requests_redirectable list, FALSE if the proposed redirection is to a "file://..." URL, and TRUE otherwise. Subclasses might want to override this. (This method's behavior in previous versions was simply to return TRUE for anything except POST requests).
¶Re: •Re: LWP::UserAgent subclass to make it follow redirects after POST (like Netscape)
by Anonymous Monk on Feb 27, 2002 at 00:59 UTC
    The latest LWP::UserAgent allows the behavior of redirect_ok to be overridden using the requests_redirectable method.
    $ua->requests_redirectable( ); # to read $ua->requests_redirectable( \@requests ); # to set This reads or sets the object's list of request names that $ua->redirect_ok(...) will allow redirection for. By default, this is ['GET', 'HEAD'], as per RFC 2068. To change to include 'POST', consider: push @{ $ua->requests_redirectable }, 'POST'; $ua->redirect_ok($prospective_request) This method is called by request() before it tries to follow a redirection to the request in $prospective_request. This should return a true value if this redirection is permissible. The default implementation will return FALSE unless the method is in the object's requests_redirectable list, FALSE if the proposed redirection is to a "file://..." URL, and TRUE otherwise. Subclasses might want to override this. (This method's behavior in previous versions was simply to return TRUE for anything except POST requests).

Back to Code Catacombs

Log In?
Username:
Password:

What's my password?
Create A New User
Node Status?
node history
Node Type: sourcecode [id://147608]
help
Chatterbox?
and the web crawler heard nothing...

How do I use this? | Other CB clients
Other Users?
Others wandering the Monastery: (4)
As of 2014-07-24 04:53 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    My favorite superfluous repetitious redundant duplicative phrase is:









    Results (157 votes), past polls