User agent through Privoxy?

by neilwatson (Curate)
on Jun 17, 2006 at 18:03 UTC ( #555996=perlquestion )
neilwatson has asked for the wisdom of the Perl Monks concerning the following question:

How does one go about using a user agent like LWP::UserAgent through Privoxy? I believe that Privoxy works like a SOCKS proxy. The user agent seems to ignore the proxy and connects directly. Is there another agent I can use that will work?
# Set agent and proxy
$ua = LWP::UserAgent->new;
$ua->proxy('socks', "http://localhost:8118");

# remove cookies
unlink "/tmp/cookies.txt";
$ua->cookie_jar({ file => "/tmp/cookies.txt" });

$response = $ua->get($url);
if ($response->is_success){
    print $response->content;
}else {
    print $response->status_line;
}

Neil Watson
watson-wilson.ca

Re: User agent through Privoxy?
by ioannis (Priest) on Jun 17, 2006 at 19:22 UTC
    It does not work for me either, but the proxy works when I use $ua->env_proxy. The env_proxy parameter to LWP::UserAgent->new( env_proxy => 1 ) also works fine.
Re: User agent through Privoxy?
by ioannis (Priest) on Jun 17, 2006 at 21:39 UTC
    Here is the sample code:
    use LWP::UserAgent;
    $ua = LWP::UserAgent->new;
    $ENV{ http_proxy } ='http://localhost:80';
    $ua->env_proxy;
    print $ua->get( 'http://www.google.com' )->as_string;
    With LogLevel set to 'debug', forward proxying is confirmed by the error.log of my local server:
    [Sat Jun 17 17:30:57 2006] [debug] proxy_http.c(630): Content-Type: text/html
      Produces an error for me: 500 Chunked must be last Transfer-Encoding 'identity'

      Neil Watson
      watson-wilson.ca

        Hi Neil,
        I am having the very same problem. I was trying to run WWW::Mechanize through privoxy, which in turn was forwarding everything to Tor so I could run my scripts anonymously. I set everything up on my Linux machine and was able to confirm that it was working when I used Firefox and Privoxy to check my Tor status at...
        status

        When I then used my script, I kept getting error message...
        500 Chunked must be last Transfer-Encoding 'identity'
        Here's my program...
        <tor_test.pl>
        #!/usr/bin/perl -w
        use strict;
        use WWW::Mechanize;
        use HTTP::Cookies;

        # this script will test to see how WWW::Mechanize works with Tor
        sub main {
            my $cookie_jar = HTTP::Cookies->new(
                file         => 'cookies.dat',
                autosave     => 1,
                hide_cookie2 => 1
            );
            my $bot = WWW::Mechanize->new;
            $bot->max_redirect(100);
            $bot->cookie_jar($cookie_jar);
            $bot->add_header(Accept => 'text/xml,application/xml,application/xhtml+xml,text/html;q=0.9,text/plain;q=0.8,image/png,*/*;q=0.5');
            $bot->add_header('Accept-Language' => 'en-us,en;q=0.5');
            $bot->add_header('Accept-Charset' => 'ISO-8859-1,utf-8;q=0.7,*;q=0.7');
            $bot->add_header('Cache-Control' => 'max-age=0');

            # port 8118 for privoxy
            $bot->proxy('http', 'http://127.0.0.1:8118');
            $bot->agent('Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.0.3) Gecko/20060426 Firefox/1.5.0.3');
            $bot->timeout(600);
            $bot->stack_depth("3");

            my $url = 'http://serifos.eecs.harvard.edu/cgi-bin/ipaddr.pl?tor=1';
            my $response = $bot->get($url);
            my $content = $bot->content;
            print ("$content");
            print ("fin");
        }

        &main;
        </tor_test.pl>

        The result was...
        500 Chunked must be last Transfer-Encoding 'identity'
        Initially, I thought that the problem was that the timeout was not long enough, but after trying timeout values ranging from small to very large, I still got the same error. I also noticed that when I stepped through the code, the timeout did not seem to have any impact on how quickly the 500 was generated (instantaneously). So, I edited the privoxy config file to increase logging.
        <privoxy config>
        debug 16
        </privoxy config>

        Now I restart privoxy and "tail -f /var/log/privoxy/logfile"
        <privoxy logfile>
        Aug 30 13:00:37 Privoxy(-1208476752) Request: serifos.eecs.harvard.edu/cgi-bin/ipaddr.pl?tor=1
        Aug 30 13:00:37 Privoxy(-1208476752) Writing:
        Aug 30 13:00:38 Privoxy(-1208476752) Writing: GET /cgi-bin/ipaddr.pl?tor=1 HTTP/1.1
        Cache-Control: max-age=0
        Accept: text/xml,application/xml,application/xhtml+xml,text/html;q=0.9,text/plain;q=0.8,image/png,*/*;q=0.5
        Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7
        Accept-Language: en-us,en;q=0.5
        Host: serifos.eecs.harvard.edu
        User-Agent: Mozilla (X11; I; Linux 2.0.32 i586)
        Connection: close

        Aug 30 13:00:38 Privoxy(-1208476752) Writing: HTTP/1.1 200 OK
        Date: Wed, 30 Aug 2006 20:00:38 GMT
        Server: Apache/1.3.34 (Debian)
        Transfer-Encoding: identity
        Content-Type: text/html; charset=iso-8859-1
        Connection: close

        Aug 30 13:00:38 Privoxy(-1208476752) Writing: <!doctype html public "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd">
        <html>
        <head><script>function PrivoxyWindowOpen(){return(null);}</script>
        <title>Tor Test Results</title>
        <meta name="Author" content="Geoffrey Goodell">
        <meta http-equiv="Content-Type" content="text/html; charset=utf-8">
        <meta http-equiv="Content-Style-Type" content="text/css">
        <link rel="stylesheet" type="text/css" href="http://serifos.eecs.harvard.edu/style.css">
        </head>
        <body>
        You seem to be using Tor!
        <p>You connected to this site from <b>140.247.62.119</b>, which is a valid <a href="http://tor.eff.org/">Tor</a> exit node named
        <b>serifos</b>. Congratulations!</p>
        </body>
        </privoxy logfile>

        So, you can see that I am getting a response from the remote server and everything is working!! But for some reason WWW::Mechanize doesn't like the response from Privoxy and issues the 500 error rather than accept the results. I grep'd the perl code and found several references...

        find /usr/lib/perl5 -exec grep -H 'Transfer-Encoding' '{}' \;
        ...and this seems to be the line where it is choking...
        /usr/lib/perl5/vendor_perl/5.8.5/Net/HTTP/Methods.pm: die "Chunked must be last Transfer-Encoding '$te'"
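
The error message suggests that the client only accepts a Transfer-Encoding header whose last coding is "chunked", so the "Transfer-Encoding: identity" line in the Privoxy log above would be rejected. A minimal sketch of that kind of check, using only core Perl, follows. The helper name body_is_chunked is hypothetical, and this is a simplified illustration, not the actual code from Net/HTTP/Methods.pm:

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Net::HTTP): returns 1 when the body
# should be read as chunked, 0 when no Transfer-Encoding header was sent,
# and dies when the header is present but does not end in "chunked".
sub body_is_chunked {
    my ($te) = @_;
    return 0 unless defined $te && length $te;

    # The header is a comma-separated list of codings, compared case-insensitively.
    my @codings = split /\s*,\s*/, lc $te;
    my $last    = @codings ? $codings[-1] : '';

    die "Chunked must be last Transfer-Encoding '$te'\n"
        unless $last eq 'chunked';
    return 1;
}

# Try a few header values, including the one Privoxy wrote in the log.
for my $header ('chunked', 'gzip, chunked', 'identity') {
    my $ok = eval { body_is_chunked($header) };
    print defined $ok ? "accepted: $header\n" : "rejected: $@";
}
```

Under this sketch, 'identity' is the only one of the three values that gets rejected, which matches the immediate 500 seen regardless of the timeout setting.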

        I haven't gotten any further on this problem. If someone else can suggest something, I'd be very appreciative!!

        Edited by planetscape - linkified link and changed pre to code tags

Re: User agent through Privoxy?
by Anonymous Monk on Jun 18, 2006 at 08:21 UTC
    I believe that Privoxy works like a SOCKS proxy.
    Know, RTFM :)
    $ua->proxy( ['http', 'https' ], "http://localhost:8118");
      I tried using http in the proxy settings. When I did that, the agent ignored the proxy altogether.

      Neil Watson
      watson-wilson.ca

        This turns out to be a bug in Privoxy. If you upgrade to the 3.0.5-Beta version, the chunked problem goes away.
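
For reference, the first argument to proxy() names the scheme of the requests to be proxied, not the type of proxy. Registering Privoxy under 'socks', as in the original question, therefore leaves ordinary http requests without a proxy, which would explain the direct connections. A minimal sketch of the configuration suggested earlier in the thread, assuming LWP::UserAgent is installed and Privoxy is listening on localhost:8118 (the port used in the posts above); it only inspects the settings and makes no network request:

```perl
use strict;
use warnings;
use LWP::UserAgent;

# Assumption: Privoxy listens on localhost:8118, as in the posts above.
my $privoxy = 'http://localhost:8118';

my $ua = LWP::UserAgent->new;

# Route requests for http:// and https:// URLs through Privoxy.
# The scheme list describes the requests, not the proxy protocol.
$ua->proxy(['http', 'https'], $privoxy);

# Calling proxy() with only a scheme returns the proxy set for that scheme.
for my $scheme (qw(http https socks)) {
    my $proxy = $ua->proxy($scheme);
    printf "%-5s => %s\n", $scheme,
        defined $proxy ? $proxy : '(none, direct connection)';
}
```

Printing the per-scheme settings like this is a quick way to confirm which requests will actually go through the proxy before testing against a live server.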
