Beefy Boxes and Bandwidth Generously Provided by pair Networks
Welcome to the Monastery
 
PerlMonks  

LWP::UserAgent subclass to make it follow redirects after POST (like Netscape)

by gregorovius (Friar)
on Feb 26, 2002 at 16:09 UTC ( [id://147608]=sourcecode: print w/replies, xml ) Need Help??
Category: Web Stuff
Author/Contact Info Claudio Garcia (gregorovius) (claudio.garcia@stanfordalumni.org)
Description: A subclass of LWP::UserAgent that replicates Netscape's behavior on redirects after a POST request (ie. it will follow POST redirects but it will turn them into GETs before doing so ). I believe Microsoft's IE behaves like this as well.

A lot of web applications rely on this non-standard behavior in browsers so I think it would be a good idea to integrate this to LWP. See Redirect after POST behavior in LWP::UserAgent differs from Netscape for reference.

Look for the XXX marker in the code to see where this code differs from the one in LWP::UserAgent.

package LWP::NetscapeLikeUserAgent;

use LWP::UserAgent;

use vars qw(@ISA);

@ISA = qw(LWP::UserAgent);


#
# request
#
# Substitutes the default (and standard compliant) "request" method of
+ the
# LWP::UserAgent module for one that behaves like the Netscape browser
+, ie.
# it will follow redirects after POST requests.
#
sub request
{
    my($self, $request, $arg, $size, $previous) = @_;

    LWP::Debug::trace('()');

    my $response = $self->simple_request($request, $arg, $size);

    my $code = $response->code;
    $response->previous($previous) if defined $previous;

    LWP::Debug::debug('Simple response: ' .
              (HTTP::Status::status_message($code) ||
               "Unknown code $code"));

    if ($code == &HTTP::Status::RC_MOVED_PERMANENTLY or
    $code == &HTTP::Status::RC_MOVED_TEMPORARILY) {

    # Make a copy of the request and initialize it with the new URI
    my $referral = $request->clone;

    # XXX This is where this routine differs from the one in LWP::User
+Agent.
    # If the method of the request is POST then we make the method of 
+the
    # new request be GET, so that it passes the redirect_ok test. This
+ is 
    # non-standard but is the way Netscape does it, and many web appli
+cations
    # rely on this.
    if($request->method eq 'POST') {
        $referral->method('GET');
        $referral->content('');
    }

    # And then we update the URL based on the Location:-header.
    my $referral_uri = $response->header('Location');
    {
        # Some servers erroneously return a relative URL for redirects
+,
        # so make it absolute if it not already is.
        local $URI::ABS_ALLOW_RELATIVE_SCHEME = 1;
        my $base = $response->base;
        $referral_uri = $HTTP::URI_CLASS->new($referral_uri, $base)
                    ->abs($base);
    }

    $referral->url($referral_uri);

    return $response unless $self->redirect_ok($referral);

    # Check for loop in the redirects
    my $count = 0;
    my $r = $response;
    while ($r) {
        if (++$count > 13 ||
                $r->request->url->as_string eq $referral_uri->as_strin
+g) {
        $response->header("Client-Warning" =>
                  "Redirect loop detected");
        return $response;
        }
        $r = $r->previous;
    }

    return $self->request($referral, $arg, $size, $response);

    } elsif ($code == &HTTP::Status::RC_UNAUTHORIZED ||
         $code == &HTTP::Status::RC_PROXY_AUTHENTICATION_REQUIRED
        )
    {
    my $proxy = ($code == &HTTP::Status::RC_PROXY_AUTHENTICATION_REQUI
+RED);
    my $ch_header = $proxy ?  "Proxy-Authenticate" : "WWW-Authenticate
+";
    my @challenge = $response->header($ch_header);
    unless (@challenge) {
        $response->header("Client-Warning" => 
                  "Missing Authenticate header");
        return $response;
    }

    require HTTP::Headers::Util;
    CHALLENGE: for my $challenge (@challenge) {
        $challenge =~ tr/,/;/;  # "," is used to separate auth-params!
+!
        ($challenge) = HTTP::Headers::Util::split_header_words($challe
+nge);
        my $scheme = lc(shift(@$challenge));
        shift(@$challenge); # no value
        $challenge = { @$challenge };  # make rest into a hash
        for (keys %$challenge) {       # make sure all keys are lower 
+case
        $challenge->{lc $_} = delete $challenge->{$_};
        }

        unless ($scheme =~ /^([a-z]+(?:-[a-z]+)*)$/) {
        $response->header("Client-Warning" => 
                  "Bad authentication scheme '$scheme'");
        return $response;
        }
        $scheme = $1;  # untainted now
        my $class = "LWP::Authen::\u$scheme";
        $class =~ s/-/_/g;
    
        no strict 'refs';
        unless (%{"$class\::"}) {
        # try to load it
        eval "require $class";
        if ($@) {
            if ($@ =~ /^Can\'t locate/) {
            $response->header("Client-Warning" =>
                      "Unsupported authentication scheme '$scheme'");
            } else {
            $response->header("Client-Warning" => $@);
            }
            next CHALLENGE;
        }
        }
        return $class->authenticate($self, $proxy, $challenge, $respon
+se,
                    $request, $arg, $size);
    }
    return $response;
    }
    return $response;
}

sub redirect_ok
{
    1;
}

1;
  • Comment on LWP::UserAgent subclass to make it follow redirects after POST (like Netscape)
  • Download Code
Replies are listed 'Best First'.
•Re: LWP::UserAgent subclass to make it follow redirects after POST (like Netscape)
by merlyn (Sage) on Feb 26, 2002 at 16:15 UTC
    This is much simpler:
    use LWP::UserAgent; sub LWP::UserAgent::redirect_ok { my ($self, $request) = @_; $request->method("GET"),$request->content("") if $request->method eq + "POST"; 1; } ...
    You can subclass it if you wish, but just "patching" it fixes program-wide behavior.

    -- Randal L. Schwartz, Perl hacker


    updated code based on feedback.
      Doing that is not enough. The POST request must be converted into a GET request by the agent and its content removed. That's Netscape's exact behavior, as far as I remember (I wrote this code about a year ago). This snippet in my code does it:
      if($request->method eq 'POST') { $referral->method('GET'); $referral->content(''); }
      Last time I checked UserAgent was not doing this as a result of overriding redirect_ok().

      Update: This is a reply to Merlyn's original post, in which redirect_ok simply returned a 1.

      The latest LWP::UserAgent allows the behavior of redirect_ok to be overridden using the requests_redirectable method.
      $ua->requests_redirectable( ); # to read $ua->requests_redirectable( \@requests ); # to set This reads or sets the object's list of request names that $ua->redirect_ok(...) will allow redirection for. By default, this is ['GET', 'HEAD'], as per RFC 2068. To change to include 'POST', consider: push @{ $ua->requests_redirectable }, 'POST'; $ua->redirect_ok($prospective_request) This method is called by request() before it tries to follow a redirection to the request in $prospective_request. This should return a true value if this redirection is permissible. The default implementation will return FALSE unless the method is in the object's requests_redirectable list, FALSE if the proposed redirection is to a "file://..." URL, and TRUE otherwise. Subclasses might want to override this. (This method's behavior in previous versions was simply to return TRUE for anything except POST requests).
¶Re: •Re: LWP::UserAgent subclass to make it follow redirects after POST (like Netscape)
by Anonymous Monk on Feb 27, 2002 at 00:59 UTC
    The latest LWP::UserAgent allows the behavior of redirect_ok to be overridden using the requests_redirectable method.
    $ua->requests_redirectable( ); # to read $ua->requests_redirectable( \@requests ); # to set This reads or sets the object's list of request names that $ua->redirect_ok(...) will allow redirection for. By default, this is ['GET', 'HEAD'], as per RFC 2068. To change to include 'POST', consider: push @{ $ua->requests_redirectable }, 'POST'; $ua->redirect_ok($prospective_request) This method is called by request() before it tries to follow a redirection to the request in $prospective_request. This should return a true value if this redirection is permissible. The default implementation will return FALSE unless the method is in the object's requests_redirectable list, FALSE if the proposed redirection is to a "file://..." URL, and TRUE otherwise. Subclasses might want to override this. (This method's behavior in previous versions was simply to return TRUE for anything except POST requests).

Log In?
Username:
Password:

What's my password?
Create A New User
Domain Nodelet?
Node Status?
node history
Node Type: sourcecode [id://147608]
help
Chatterbox?
and the web crawler heard nothing...

How do I use this?Last hourOther CB clients
Other Users?
Others musing on the Monastery: (6)
As of 2024-04-23 20:04 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    No recent polls found