
Re^2: GET request using LWP::UserAgent returns 200 OK but Firefox 302 Found

by bliako (Monsignor)
on Mar 11, 2018 at 01:23 UTC (#1210640)


in reply to Re: GET request using LWP::UserAgent returns 200 OK but Firefox 302 Found
in thread GET request using LWP::UserAgent returns 200 OK but Firefox 302 Found

Good thinking! Indeed, there are other GETs before the ones I described (in a previous phase which completes successfully), so the cookie actually is set. Thanks.

I think I am getting closer, though I still have to test it.

After using Wireshark (as per 7stud's and haukex's advice) and LWP::ConsoleLogger::Easy, I realised that LWP::UserAgent could be responding to a '302 Found' automatically by following the redirect. So BOTH my program (via LWP) and LWP itself (responding to the 302 automatically) send a request to the new location, with mine going out after LWP has finished with its own. And that messes things up.

The LWP::UserAgent man page states that there is a list called 'requests_redirectable' which contains the request methods for which redirects are followed. By default, 'GET' and 'HEAD' are included. POST is not.

Given also that LWP's 'max_redirect' is 7 by default, it looks to me that a GET returning a 302 will cause LWP to follow the redirect automatically. But I am also doing that myself in the program, having assumed that LWP would not follow redirects (or having forgotten that it does).

In my 'scraping exercise' there is a long list of earlier POSTs which return a 302, but this is the first time a GET does. The POST redirects were not followed by LWP and all was OK, but the GET redirect is followed (because GET is in LWP's 'requests_redirectable' list), and so the problem arose.
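A quick way to check the defaults described above is to ask a fresh user agent directly. This is only a minimal sketch, assuming a stock LWP::UserAgent with no subclassing or site-wide configuration; the variable names are mine.

```perl
use strict;
use warnings;
use LWP::UserAgent;

# A freshly constructed user agent, no options changed.
my $ua = LWP::UserAgent->new;

# requests_redirectable returns an array reference of method names.
my @methods = @{ $ua->requests_redirectable };

# max_redirect is the number of redirects followed for one request.
my $limit = $ua->max_redirect;

print "redirectable methods: @methods\n";
print "max_redirect: $limit\n";
```

If POST does not show up in the printed list, a 302 answer to a POST is handed back to the caller, while a 302 answer to a GET is followed silently.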

thanks

Re^3: GET request using LWP::UserAgent returns 200 OK but Firefox 302 Found
by bliako (Monsignor) on Mar 20, 2018 at 17:35 UTC

    I can now confirm that the problem was indeed that LWP was following redirects (as it should), while I was also following them myself by issuing another request via LWP.

    So how I solved it was to set

    $ua->requests_redirectable([]);
    which tells LWP::UserAgent not to follow any redirects for any request.

    (setting

    $ua->requests_redirectable(['GET']);
    would allow only GET to be followed by LWP).
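The difference between the two settings can be seen without touching a real server. The sketch below is not my scraper: the URLs are hypothetical, and a 'request_send' handler stands in for the server by answering every request itself (a 302 for the start URL, a 200 for anything else), so nothing goes over the network.

```perl
use strict;
use warnings;
use LWP::UserAgent;
use HTTP::Response;

# Hypothetical URLs, used only as labels for the fake server below.
my $start  = 'http://server.example/start';
my $target = 'http://server.example/target';

# Build a user agent whose requests are answered locally by a handler.
sub make_ua {
    my $ua = LWP::UserAgent->new;
    $ua->add_handler(request_send => sub {
        my ($request) = @_;
        # The start URL redirects; everything else is a plain page.
        if ($request->uri eq $start) {
            return HTTP::Response->new(302, 'Found', [Location => $target]);
        }
        return HTTP::Response->new(200, 'OK', [], "target page\n");
    });
    return $ua;
}

# Default behaviour: GET is redirectable, so LWP follows the 302 itself.
my $auto      = make_ua();
my $auto_code = $auto->get($start)->code;

# Redirects disabled: the 302 comes back and the program decides what to do.
my $manual = make_ua();
$manual->requests_redirectable([]);
my $resp        = $manual->get($start);
my $manual_code = $resp->code;
my $location    = $resp->header('Location');

print "default UA saw:        $auto_code\n";
print "no-redirect UA saw:    $manual_code\n";
print "Location to follow up: $location\n";
```

With the default list the caller should only ever see the final 200, which matches the subject of this thread; with the empty list the 302 and its Location header are visible, and the follow-up request is issued exactly once, by the program.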

    I have also discovered another problem with allowing UA to follow redirects. In a redirect, the server issues a 302 status (or another 3xx) and sends a Location header which contains the URL to redirect to. UA extracts this Location URL and issues another request to it.

    The problem is that the server sometimes sends back a relative URL, which UA then tries to make absolute. In my case, UA failed to do that correctly. So even if I had allowed UA to follow the redirect, it would have failed by sending a malformed URL to the server.

    UA has the following code to convert the url:

    my $referral_uri = $response->header('Location');
    {
        # Some servers erroneously return a relative URL for redirects,
        # so make it absolute if it not already is.
        local $URI::ABS_ALLOW_RELATIVE_SCHEME = 1;
        my $base = $response->base;
        $referral_uri = "" unless defined $referral_uri;
        $referral_uri = $HTTP::URI_CLASS->new($referral_uri, $base)->abs($base);
    }
    $referral->uri($referral_uri);

    In my case:

    base='http://server.com/ABC/afilename1?op=678'
    referral='../../ABC/XYZ/KLM/afilename2?aa=123'
    and the calculated new referral came out as:
    http://server.com/../ABC/XYZ/KLM/afilename2?aa=123

    instead of the correct one of:

    http://server.com/ABC/XYZ/KLM/afilename2?aa=123

    Maybe this is expected behaviour from URI->abs()? I will send a bug report just in case.
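For what it is worth, the URI documentation describes a package variable, $URI::ABS_REMOTE_LEADING_DOTS, which makes abs() drop excess leading ".." segments, so the result above looks like URI's documented default and not a bug in LWP. Here is a minimal sketch with the same base and referral as in my case, using plain URI and not LWP's own redirect code; the variable names are mine.

```perl
use strict;
use warnings;
use URI;

# Same values as in the redirect described above.
my $base = 'http://server.com/ABC/afilename1?op=678';
my $rel  = '../../ABC/XYZ/KLM/afilename2?aa=123';

# Default behaviour: a ".." that would climb above the root is kept.
my $default = URI->new($rel)->abs($base);

# With the documented switch set, excess leading ".." segments are dropped.
my $stripped = do {
    local $URI::ABS_REMOTE_LEADING_DOTS = 1;
    URI->new($rel)->abs($base);
};

print "default:  $default\n";
print "stripped: $stripped\n";
```

Setting the variable with local around the request should, as far as I can tell, make LWP's redirect code resolve the Location the same way, but I have not tested that against the real server.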

    Thanks Monks
