PerlMonks  

Re: User agent through Privoxy?

by ioannis (Abbot)
on Jun 17, 2006 at 21:39 UTC


in reply to User agent through Privoxy?

Here is the sample code:
use LWP::UserAgent;
$ua = LWP::UserAgent->new;
$ENV{ http_proxy } = 'http://localhost:80';
$ua->env_proxy;
print $ua->get( 'http://www.google.com' )->as_string;
With LogLevel set to 'debug', forward proxying is confirmed by the error.log of my local server:
[Sat Jun 17 17:30:57 2006] [debug] proxy_http.c(630): Content-Type: text/html
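As an alternative to the http_proxy environment variable, the proxy can also be set directly on the user agent. The following is only a minimal sketch: the helper name make_proxied_ua is made up for illustration, and the address assumes Privoxy on its port 8118 as mentioned later in this thread (the sample above points at a local server on port 80 instead).

```perl
use strict;
use warnings;
use LWP::UserAgent;

# Hypothetical helper: build a user agent that sends plain-HTTP
# requests through the given proxy instead of reading %ENV.
sub make_proxied_ua {
    my ($proxy_url) = @_;
    my $ua = LWP::UserAgent->new;
    $ua->proxy( 'http', $proxy_url );
    return $ua;
}

# Assumed address: Privoxy listening locally on port 8118.
my $ua = make_proxied_ua('http://127.0.0.1:8118');

# Called with only a scheme, proxy() returns the current setting.
print "http proxy: ", $ua->proxy('http'), "\n";
```

No request is sent here; calling $ua->get(...) afterwards would go through the configured proxy.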

Re^2: User agent through Privoxy?
by neilwatson (Priest) on Jun 17, 2006 at 23:48 UTC
    This produces an error for me: 500 Chunked must be last Transfer-Encoding 'identity'

    Neil Watson
    watson-wilson.ca

      Hi Neil,
      I am having the very same problem. I was trying to run WWW::Mechanize through Privoxy, which in turn forwarded everything to Tor so I could run my scripts anonymously. I set everything up on my Linux machine and confirmed that it was working by using Firefox and Privoxy to check my Tor status at...
      status

      When I then used my script, I kept getting error message...
      500 Chunked must be last Transfer-Encoding 'identity'
      Here's my program...
      <tor_test.pl>
      #!/usr/bin/perl -w
      use strict;
      use WWW::Mechanize;
      use HTTP::Cookies;

      # this script will test to see how WWW::Mechanize works with Tor
      sub main {
          my $cookie_jar = HTTP::Cookies->new( file => 'cookies.dat', autosave => 1, hide_cookie2 => 1 );
          my $bot = WWW::Mechanize->new;
          $bot->max_redirect(100);
          $bot->cookie_jar($cookie_jar);
          $bot->add_header(Accept => 'text/xml,application/xml,application/xhtml+xml,text/html;q=0.9,text/plain;q=0.8,image/png,*/*;q=0.5');
          $bot->add_header('Accept-Language' => 'en-us,en;q=0.5');
          $bot->add_header('Accept-Charset' => 'ISO-8859-1,utf-8;q=0.7,*;q=0.7');
          $bot->add_header('Cache-Control' => 'max-age=0');
          # port 8118 for privoxy
          $bot->proxy('http', 'http://127.0.0.1:8118');
          $bot->agent('Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.0.3) Gecko/20060426 Firefox/1.5.0.3');
          $bot->timeout(600);
          $bot->stack_depth("3");
          my $url = 'http://serifos.eecs.harvard.edu/cgi-bin/ipaddr.pl?tor=1';
          my $response = $bot->get($url);
          my $content = $bot->content;
          print ("$content");
          print ("fin");
      }

      &main;
      </tor_test.pl>

      The result was...
      500 Chunked must be last Transfer-Encoding 'identity'
      Initially, I thought the timeout was too short, but I get the same error across a range of timeout values, from small to very large. When I stepped through the code, I also noticed that the timeout had no effect on how quickly the 500 was generated (instantaneously). So, I edited the Privoxy config file to increase logging.
      <privoxy config>
      debug 16
      </privoxy config>

      Then I restarted Privoxy and ran "tail -f /var/log/privoxy/logfile":
      <privoxy logfile>
      Aug 30 13:00:37 Privoxy(-1208476752) Request: serifos.eecs.harvard.edu/cgi-bin/ipaddr.pl?tor=1
      Aug 30 13:00:37 Privoxy(-1208476752) Writing:
      Aug 30 13:00:38 Privoxy(-1208476752) Writing: GET /cgi-bin/ipaddr.pl?tor=1 HTTP/1.1
      Cache-Control: max-age=0
      Accept: text/xml,application/xml,application/xhtml+xml,text/html;q=0.9,text/plain;q=0.8,image/png,*/*;q=0.5
      Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7
      Accept-Language: en-us,en;q=0.5
      Host: serifos.eecs.harvard.edu
      User-Agent: Mozilla (X11; I; Linux 2.0.32 i586)
      Connection: close
      Aug 30 13:00:38 Privoxy(-1208476752) Writing: HTTP/1.1 200 OK
      Date: Wed, 30 Aug 2006 20:00:38 GMT
      Server: Apache/1.3.34 (Debian)
      Transfer-Encoding: identity
      Content-Type: text/html; charset=iso-8859-1
      Connection: close
      Aug 30 13:00:38 Privoxy(-1208476752) Writing: <!doctype html public "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd">
      <html>
      <head><script>function PrivoxyWindowOpen(){return(null);}</script>
      <title>Tor Test Results</title>
      <meta name="Author" content="Geoffrey Goodell">
      <meta http-equiv="Content-Type" content="text/html; charset=utf-8">
      <meta http-equiv="Content-Style-Type" content="text/css">
      <link rel="stylesheet" type="text/css" href="http://serifos.eecs.harvard.edu/style.css">
      </head>
      <body>
      You seem to be using Tor!
      <p>You connected to this site from <b>140.247.62.119</b>, which is a valid <a href="http://tor.eff.org/">Tor</a> exit node named <b>serifos</b>. Congratulations!</p>
      </body>

      </privoxy logfile>

      So, you can see that I am getting a response from the remote server and everything is working! But for some reason WWW::Mechanize doesn't like the response from Privoxy and issues the 500 error rather than accepting the result. I grepped the Perl library code and found several references...

      find /usr/lib/perl5 -exec grep -H 'Transfer-Encoding' '{}' \;
      ...and this seems to be the line where it is choking...
      /usr/lib/perl5/vendor_perl/5.8.5/Net/HTTP/Methods.pm: die "Chunked must be last Transfer-Encoding '$te'"
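To make the failure easier to reason about, here is a simplified stand-in for the kind of check that die message implies. It is a sketch only, not the actual Net::HTTP::Methods source: the function name check_transfer_encoding and its return values are made up for illustration. It assumes the rule is that, when a Transfer-Encoding header is present, the last listed coding must be "chunked", which would explain why the "Transfer-Encoding: identity" header in the Privoxy log above is rejected.

```perl
use strict;
use warnings;

# Hypothetical, simplified version of the check; NOT the real module code.
# Returns 'none' when there is no header, 'chunked' when the last coding
# is chunked, and dies with the familiar message otherwise.
sub check_transfer_encoding {
    my ($te) = @_;
    return 'none' unless defined $te && length $te;

    # The header may list several codings, separated by commas.
    my @codings = split /\s*,\s*/, lc $te;
    die "Chunked must be last Transfer-Encoding '$te'\n"
        unless @codings && $codings[-1] eq 'chunked';

    return 'chunked';
}

for my $header ( 'chunked', 'gzip, chunked', 'identity' ) {
    my $result = eval { check_transfer_encoding($header) };
    my $error  = $@;
    chomp $error;
    printf "%-15s => %s\n", $header,
        defined $result ? $result : "rejected: $error";
}
```

Under that assumption, the error is raised on the client side while parsing the response headers, before any of the body is read.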

      I haven't gotten any further on this problem. If anyone can suggest something, I'd be very appreciative!
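One thing that may help when debugging this: LWP marks error responses that it generates itself with a "Client-Warning: Internal response" header, so a 500 produced inside the client can be told apart from a 500 sent by the server or the proxy. The sketch below assumes that header is present; the helper name is_internal_response is made up, and the response object is built by hand so the example runs without any network access.

```perl
use strict;
use warnings;
use HTTP::Response;

# Hypothetical helper: true if the response was built by LWP itself
# rather than received from a server or proxy.
sub is_internal_response {
    my ($res) = @_;
    my $warning = $res->header('Client-Warning');
    return ( defined($warning) && $warning eq 'Internal response' ) ? 1 : 0;
}

# Hand-built stand-in for the failing response (an assumption, not
# captured output from the script above).
my $res = HTTP::Response->new( 500,
    "Chunked must be last Transfer-Encoding 'identity'" );
$res->header( 'Client-Warning' => 'Internal response' );

print $res->status_line, "\n";
print is_internal_response($res)
    ? "generated by the client\n"
    : "sent by the server\n";
```

In the script above, the same test could be applied to the object returned by $bot->get($url).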

      Edited by planetscape - linkified link and changed pre to code tags
