LWP issuing two CONNECT requests when only one is asked for?

by mickey (Acolyte)
on Apr 04, 2005 at 16:15 UTC (#444729)
mickey has asked for the wisdom of the Perl Monks concerning the following question:

Monks,

I'm trying to access an HTTPS website from behind a proxy server using LWP::UserAgent and Crypt::SSLeay.

My test script looks like this:

    #!/usr/bin/perl
    use strict;
    use warnings;

    use LWP::UserAgent;

    my $url = $ARGV[0];

    my $proxy_ip   = 'https-proxy.example.com';
    my $proxy_port = 80;

    $ENV{'HTTPS_PROXY'} = "$proxy_ip:$proxy_port";
    $ENV{'HTTPS_DEBUG'} = 1;

    my $ua = LWP::UserAgent->new;

    print "Connecting to $url ...\n";
    my $response = $ua->get( $url );

    $response->is_success
        or die "Failed to GET '$url': ", $response->status_line;

    print $response->as_string;

This returns a "403 Forbidden" error from the proxy. When I watch the transaction with Ethereal, though, I get the following:

    CONNECT remote.server.com:443 HTTP/1.0
    HTTP/1.0 200 Connection established
    CONNECT https-proxy.example.com:80 HTTP/1.0
    HTTP/1.0 403 Forbidden

It looks like the first CONNECT request is working fine, but LWP::UserAgent is actually sending two CONNECT requests when I've only asked it to send one, and the second request is failing because it's trying to connect directly to the proxy rather than to the remote server.

Any thoughts on why this would be happening?

Thanks.

Re: LWP issuing two CONNECT requests when only one is asked for?
by mpeters (Chaplain) on Apr 04, 2005 at 17:22 UTC
    It looks like the site in question is sending you a redirect and LWP is following that redirect. The 403 isn't an error on the LWP side, since the server is actually sending it.

      But the reply from the server to the initial CONNECT is 200 OK, not 302 Found or some other redirect code -- I don't think LWP would interpret that as a redirect, would it?

      Also, the 403 seems to be coming from the proxy rather than the remote server. Obviously all the traffic is coming from the proxy, so I can't tell that way, but because the preceding CONNECT is to the proxy server rather than the remote server I would guess the response comes from the proxy as well.

      And the text of the HTML page returned along with the 403 response seems to indicate it's coming from our proxy, unless the remote site is running the same type of web proxy.

      So I'm a little doubtful of your answer, although I appreciate your help!

Re: LWP issuing two CONNECT requests when only one is asked for?
by davido (Archbishop) on Apr 04, 2005 at 18:05 UTC

    It looks to me like LWP::UserAgent is asking for the remote server, obtaining a connection to the proxy, and then the proxy is failing to gain access to the remote server (Forbidden). The error messages are a little convoluted, but it seems to me that UserAgent reports "connection established" when it gets a response from the proxy. The first message is simply oblivious to the fact that there is a proxy mediating the connection. The second response is the one the proxy obtains when it attempts to perform the mediation.


    Dave

      I see what you're getting at, but I don't understand why the second CONNECT request is asking for the proxy server rather than the remote server.

      Is that part of the protocol?

        Just confirmed my own theory, i.e. that the second CONNECT isn't part of the protocol and isn't sent by other HTTPS clients. Requesting the same HTTPS server with Internet Explorer and watching the HTTP traffic with Ethereal, I only see one CONNECT, the first one, to the specified remote server.

        So it does seem like one of the underlying modules that LWP::UserAgent uses to connect to HTTPS websites is doing something out of the ordinary.

        I'll try cutting LWP::UserAgent out of the loop and gradually getting more low-level until I find the point it's going wrong, but if anyone has any suggestions as to where I might start looking that would be great!
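One way to start at the lowest level is to send a single CONNECT to the proxy by hand and look at its status line. The following is only a rough sketch of that idea, not a diagnosis of the problem above: the helper names build_connect_request, parse_status_line and probe_proxy are made up for illustration, and the proxy and target hosts are taken from the command line rather than from the original script.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use IO::Socket::INET;

# Build the CONNECT request a client sends to a proxy to open a tunnel.
# (Hypothetical helper; HTTP/1.0 is used to match the trace in the question.)
sub build_connect_request {
    my ($host, $port) = @_;
    return "CONNECT $host:$port HTTP/1.0\r\n\r\n";
}

# Split a status line such as "HTTP/1.0 200 Connection established"
# into its numeric code and reason phrase. Returns an empty list on failure.
sub parse_status_line {
    my ($line) = @_;
    $line =~ s/\r?\n\z//;
    my ($code, $reason) = $line =~ m{^HTTP/\d+\.\d+\s+(\d{3})\s*(.*)$}
        or return;
    return ($code, $reason);
}

# Open a plain TCP connection to the proxy, send exactly one CONNECT,
# and return the first line of the proxy's reply (undef if none).
sub probe_proxy {
    my ($proxy_host, $proxy_port, $target_host, $target_port) = @_;
    my $sock = IO::Socket::INET->new(
        PeerAddr => $proxy_host,
        PeerPort => $proxy_port,
        Proto    => 'tcp',
        Timeout  => 10,
    ) or die "Cannot reach proxy $proxy_host:$proxy_port: $@\n";
    print {$sock} build_connect_request($target_host, $target_port);
    my $status = <$sock>;
    close $sock;
    return $status;
}

# Only touch the network when all arguments are supplied:
#   probe.pl proxy_host proxy_port target_host [target_port]
if (@ARGV >= 3) {
    my ($proxy_host, $proxy_port, $target_host, $target_port) = @ARGV;
    my $status = probe_proxy($proxy_host, $proxy_port,
                             $target_host, $target_port || 443);
    defined $status
        or die "Proxy closed the connection without replying\n";
    my ($code, $reason) = parse_status_line($status)
        or die "Unexpected reply from proxy: $status";
    print "Proxy answered: $code $reason\n";
}
```

If the proxy answers 200 to this single hand-written CONNECT, the proxy itself accepts the tunnel, and the extra CONNECT would have to originate somewhere in the Perl HTTPS stack; the sketch cannot say which layer.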
