PerlMonks  

User agent through Privoxy?

by neilwatson (Curate)
on Jun 17, 2006 at 18:03 UTC ( #555996=perlquestion )
neilwatson has asked for the wisdom of the Perl Monks concerning the following question:

How does one go about using a user agent like LWP::UserAgent through Privoxy? I believe that Privoxy works like a SOCKS proxy. The user agent seems to ignore the proxy and connect directly. Is there another agent I can use that will work?
# Set agent and proxy
$ua = LWP::UserAgent->new;
$ua->proxy('socks', "http://localhost:8118");

# remove cookies
unlink "/tmp/cookies.txt";
$ua->cookie_jar({ file => "/tmp/cookies.txt" });

$response = $ua->get($url);

if ($response->is_success){
    print $response->content;
}else {
    print $response->status_line;
}

Neil Watson
watson-wilson.ca

Re: User agent through Privoxy?
by ioannis (Priest) on Jun 17, 2006 at 19:22 UTC
    It does not work for me either, but the proxy works when I use $ua->env_proxy. The env_proxy parameter to LWP::UserAgent->new( env_proxy => 1 ) also works fine.
Re: User agent through Privoxy?
by ioannis (Priest) on Jun 17, 2006 at 21:39 UTC
    Here is the sample code:
    use LWP::UserAgent;
    $ua = LWP::UserAgent->new;
    $ENV{ http_proxy } = 'http://localhost:80';
    $ua->env_proxy;
    print $ua->get( 'http://www.google.com' )->as_string;
    With LogLevel set to 'debug', forward proxying is confirmed by the error.log of my local server:
    [Sat Jun 17 17:30:57 2006] [debug] proxy_http.c(630): Content-Type: text/html
      That produces an error for me: 500 Chunked must be last Transfer-Encoding 'identity'

      Neil Watson
      watson-wilson.ca

        Hi Neil,
        I am having the very same problem. I was trying to run WWW::Mechanize through privoxy, which in turn was forwarding everything to Tor so I could run my scripts anonymously. I set everything up on my Linux machine and was able to confirm that it was working when I used Firefox and Privoxy to check my Tor status at...
        status

        When I then used my script, I kept getting error message...
        500 Chunked must be last Transfer-Encoding 'identity'
        Here's my program...
        <tor_test.pl>
        #!/usr/bin/perl -w
        use strict;
        use WWW::Mechanize;
        use HTTP::Cookies;

        # this script will test to see how WWW::Mechanize works with Tor

        sub main {
            my $cookie_jar = HTTP::Cookies->new(
                file         => 'cookies.dat',
                autosave     => 1,
                hide_cookie2 => 1
            );

            my $bot = WWW::Mechanize->new;
            $bot->max_redirect(100);
            $bot->cookie_jar($cookie_jar);
            $bot->add_header(Accept => 'text/xml,application/xml,application/xhtml+xml,text/html;q=0.9,text/plain;q=0.8,image/png,*/*;q=0.5');
            $bot->add_header('Accept-Language' => 'en-us,en;q=0.5');
            $bot->add_header('Accept-Charset' => 'ISO-8859-1,utf-8;q=0.7,*;q=0.7');
            $bot->add_header('Cache-Control' => 'max-age=0');

            # port 8118 for privoxy
            $bot->proxy('http', 'http://127.0.0.1:8118');
            $bot->agent('Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.0.3) Gecko/20060426 Firefox/1.5.0.3');
            $bot->timeout(600);
            $bot->stack_depth("3");

            my $url = 'http://serifos.eecs.harvard.edu/cgi-bin/ipaddr.pl?tor=1';
            my $response = $bot->get($url);
            my $content = $bot->content;

            print ("$content");
            print ("fin");
        }

        &main;
        </tor_test.pl>

        The result was...
        500 Chunked must be last Transfer-Encoding 'identity'
        Initially, I thought the problem was that the timeout was not long enough, but after trying a range of timeout values from small to very large, I still get the same problem. I also noticed that when I stepped through the code, the timeout had no apparent effect on how quickly the 500 was generated (instantaneously). So, I edited the privoxy config file to increase logging.
        <privoxy config>
        debug 16
        </privoxy config>

        Now I restart privoxy and "tail -f /var/log/privoxy/logfile"
        <privoxy logfile>
        Aug 30 13:00:37 Privoxy(-1208476752) Request: serifos.eecs.harvard.edu/cgi-bin/ipaddr.pl?tor=1
        Aug 30 13:00:37 Privoxy(-1208476752) Writing:
        Aug 30 13:00:38 Privoxy(-1208476752) Writing: GET /cgi-bin/ipaddr.pl?tor=1 HTTP/1.1
        Cache-Control: max-age=0
        Accept: text/xml,application/xml,application/xhtml+xml,text/html;q=0.9,text/plain;q=0.8,image/png,*/*;q=0.5
        Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7
        Accept-Language: en-us,en;q=0.5
        Host: serifos.eecs.harvard.edu
        User-Agent: Mozilla (X11; I; Linux 2.0.32 i586)
        Connection: close
        Aug 30 13:00:38 Privoxy(-1208476752) Writing: HTTP/1.1 200 OK
        Date: Wed, 30 Aug 2006 20:00:38 GMT
        Server: Apache/1.3.34 (Debian)
        Transfer-Encoding: identity
        Content-Type: text/html; charset=iso-8859-1
        Connection: close
        Aug 30 13:00:38 Privoxy(-1208476752) Writing: <!doctype html public "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd">
        <html>
        <head><script>function PrivoxyWindowOpen(){return(null);}</script>
        <title>Tor Test Results</title>
        <meta name="Author" content="Geoffrey Goodell">
        <meta http-equiv="Content-Type" content="text/html; charset=utf-8">
        <meta http-equiv="Content-Style-Type" content="text/css">
        <link rel="stylesheet" type="text/css" href="http://serifos.eecs.harvard.edu/style.css">
        </head>
        <body>
        You seem to be using Tor!
        <p>You connected to this site from <b>140.247.62.119</b>, which is a valid
        <a href="http://tor.eff.org/">Tor</a> exit node named <b>serifos</b>.
        Congratulations!</p>
        </body>

        </privoxy logfile>

        So, you can see that I am getting a response from the remote server and everything is working!! But for some reason WWW::Mechanize doesn't like the response from Privoxy and issues the 500 error rather than accept the results. I grep'd the perl code and found several references...

        find /usr/lib/perl5 -exec grep -H 'Transfer-Encoding' '{}' \;
        ...and this seems to be the line where it is choking...
        /usr/lib/perl5/vendor_perl/5.8.5/Net/HTTP/Methods.pm: die "Chunked must be last Transfer-Encoding '$te'"
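
        For readers following along, the logic behind that message can be sketched roughly as below. This is a simplified illustration, not the module's actual source: the helper name check_transfer_encoding is hypothetical, and the assumption is that the client splits the Transfer-Encoding value on commas and insists that the final coding is "chunked". Under that assumption, the "Transfer-Encoding: identity" header visible in the Privoxy log above is exactly what triggers the die.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Net::HTTP): accepts a Transfer-Encoding
# header value and dies unless the last listed coding is "chunked".
sub check_transfer_encoding {
    my ($te) = @_;

    # No header at all means there is nothing to validate.
    return 1 unless defined $te && length $te;

    # Codings are comma-separated and case-insensitive.
    my @codings = split /\s*,\s*/, lc $te;

    die "Chunked must be last Transfer-Encoding '$te'\n"
        unless $codings[-1] eq 'chunked';

    return 1;
}

# Try a few header values, including the one from the Privoxy log.
for my $te ( 'chunked', 'gzip, chunked', 'identity' ) {
    my $ok = eval { check_transfer_encoding($te) };
    printf "%-15s => %s", $te, $ok ? "accepted\n" : "rejected: $@";
}
```

        The point of the sketch is that the failure happens on the client side while parsing headers, which would explain why the timeout setting makes no difference.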

        I haven't gotten any further on this problem. If anyone can suggest something, I'd be very appreciative!
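
        One diagnostic that may help here: when LWP itself generates an error response (rather than receiving one over the wire), it normally marks it with a "Client-Warning: Internal response" header. The sketch below checks for that header. The helper name is_internal_response is hypothetical, and the response object is built by hand to simulate the error from this thread, so nothing touches the network.

```perl
use strict;
use warnings;
use HTTP::Response;

# Hypothetical helper: true when the response carries the marker header
# that LWP adds to responses it builds itself.
sub is_internal_response {
    my ($response) = @_;
    my $warning = $response->header('Client-Warning') // '';
    return $warning eq 'Internal response' ? 1 : 0;
}

# Simulated response, constructed by hand to resemble the error in this
# thread (an assumption for illustration; no request is sent).
my $res = HTTP::Response->new( 500, "Chunked must be last Transfer-Encoding 'identity'" );
$res->header( 'Client-Warning' => 'Internal response' );

print $res->status_line, "\n";
print is_internal_response($res)
    ? "generated locally by LWP\n"
    : "sent by the remote server or proxy\n";
```

        With a real user agent, the same check can be applied to the object returned by $bot->get($url) or $ua->get($url).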

        Edited by planetscape - linkified link and changed pre to code tags

Re: User agent through Privoxy?
by Anonymous Monk on Jun 18, 2006 at 08:21 UTC
    I believe that Privoxy works like a SOCKS proxy.
    No, RTFM :) Privoxy is an HTTP proxy:
    $ua->proxy( ['http', 'https' ], "http://localhost:8118");
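
    A minimal, self-contained sketch of that setting, assuming Privoxy is listening on localhost:8118 as in the original question. It only configures the agent and reads the setting back, so it can be run without Privoxy or a network connection to confirm that the proxy is registered for each scheme.

```perl
use strict;
use warnings;
use LWP::UserAgent;

# Privoxy listen address taken from the question; adjust if yours differs.
my $proxy_url = 'http://localhost:8118';

my $ua = LWP::UserAgent->new;

# Register the HTTP proxy for both schemes. Note the scheme names are the
# kinds of URL being fetched, not the protocol spoken to the proxy.
$ua->proxy( [ 'http', 'https' ], $proxy_url );

# Calling proxy() with only a scheme returns the current setting.
for my $scheme (qw(http https)) {
    printf "%-5s requests go via %s\n", $scheme, $ua->proxy($scheme);
}
```

    If a scheme prints an empty value here, requests for that scheme will bypass the proxy and connect directly.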
      I tried using http in the proxy settings. When I did that, the agent ignored the proxy altogether.

      Neil Watson
      watson-wilson.ca

        This turns out to be a bug in Privoxy. If you upgrade to the 3.0.5-Beta version, the chunked problem goes away.
