PerlMonks  

Re^3: User agent through Privoxy?

by Anonymous Monk
on Aug 30, 2006 at 20:28 UTC


in reply to Re^2: User agent through Privoxy?
in thread User agent through Privoxy?

Hi Neil,
I am having the very same problem. I was trying to run WWW::Mechanize through Privoxy, which in turn was forwarding everything to Tor so that I could run my scripts anonymously. I set everything up on my Linux machine and confirmed that it was working by using Firefox through Privoxy to check my Tor status at...
status

When I then ran my script, I kept getting this error message...
500 Chunked must be last Transfer-Encoding 'identity'
Here's my program...
<tor_test.pl>

#!/usr/bin/perl -w
use strict;
use WWW::Mechanize;
use HTTP::Cookies;

# this script will test to see how WWW::Mechanize works with Tor
sub main {
    my $cookie_jar = HTTP::Cookies->new(
        file         => 'cookies.dat',
        autosave     => 1,
        hide_cookie2 => 1
    );

    my $bot = WWW::Mechanize->new;
    $bot->max_redirect(100);
    $bot->cookie_jar($cookie_jar);
    $bot->add_header(Accept => 'text/xml,application/xml,application/xhtml+xml,text/html;q=0.9,text/plain;q=0.8,image/png,*/*;q=0.5');
    $bot->add_header('Accept-Language' => 'en-us,en;q=0.5');
    $bot->add_header('Accept-Charset' => 'ISO-8859-1,utf-8;q=0.7,*;q=0.7');
    $bot->add_header('Cache-Control' => 'max-age=0');

    # port 8118 for privoxy
    $bot->proxy('http', 'http://127.0.0.1:8118');
    $bot->agent('Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.0.3) Gecko/20060426 Firefox/1.5.0.3');
    $bot->timeout(600);
    $bot->stack_depth("3");

    my $url = 'http://serifos.eecs.harvard.edu/cgi-bin/ipaddr.pl?tor=1';
    my $response = $bot->get($url);
    my $content = $bot->content;
    print ("$content");
    print ("fin");
}

&main;
</tor_test.pl>

The result was...
500 Chunked must be last Transfer-Encoding 'identity'
Initially, I thought the problem was that the timeout was not long enough, but after trying a range of timeout values from small to very large, I still got the same error. I also noticed, when I stepped through the code, that the timeout had no apparent effect on how quickly the 500 was generated (it appeared instantaneously). So, I edited the Privoxy config file to increase logging.
<privoxy config>
debug 16
</privoxy config>

Then I restarted Privoxy and ran "tail -f /var/log/privoxy/logfile"
<privoxy logfile>
Aug 30 13:00:37 Privoxy(-1208476752) Request: serifos.eecs.harvard.edu/cgi-bin/ipaddr.pl?tor=1
Aug 30 13:00:37 Privoxy(-1208476752) Writing:
Aug 30 13:00:38 Privoxy(-1208476752) Writing: GET /cgi-bin/ipaddr.pl?tor=1 HTTP/1.1
Cache-Control: max-age=0
Accept: text/xml,application/xml,application/xhtml+xml,text/html;q=0.9,text/plain;q=0.8,image/png,*/*;q=0.5
Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7
Accept-Language: en-us,en;q=0.5
Host: serifos.eecs.harvard.edu
User-Agent: Mozilla (X11; I; Linux 2.0.32 i586)
Connection: close

Aug 30 13:00:38 Privoxy(-1208476752) Writing: HTTP/1.1 200 OK
Date: Wed, 30 Aug 2006 20:00:38 GMT
Server: Apache/1.3.34 (Debian)
Transfer-Encoding: identity
Content-Type: text/html; charset=iso-8859-1
Connection: close

Aug 30 13:00:38 Privoxy(-1208476752) Writing: <!doctype html public "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd">
<html>
<head><script>function PrivoxyWindowOpen(){return(null);}</script>
<title>Tor Test Results</title>
<meta name="Author" content="Geoffrey Goodell">
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
<meta http-equiv="Content-Style-Type" content="text/css">
<link rel="stylesheet" type="text/css" href="http://serifos.eecs.harvard.edu/style.css">
</head>
<body>
You seem to be using Tor!
<p>You connected to this site from <b>140.247.62.119</b>, which is a valid <a href="http://tor.eff.org/">Tor</a> exit node named <b>serifos</b>. Congratulations!</p>
</body>

</privoxy logfile>

So, you can see that I am getting a response from the remote server and everything is working!! But for some reason WWW::Mechanize doesn't like the response from Privoxy and issues the 500 error rather than accepting the result. I grepped the Perl library code and found several references...

find /usr/lib/perl5 -exec grep -H 'Transfer-Encoding' '{}' \;
...and this seems to be the line where it is choking...
/usr/lib/perl5/vendor_perl/5.8.5/Net/HTTP/Methods.pm: die "Chunked must be last Transfer-Encoding '$te'"
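
To illustrate why a lone "identity" value could trip a check like that, here is a minimal sketch. It is not the actual Net::HTTP::Methods source: check_transfer_encoding is a hypothetical helper, and it assumes the check splits the Transfer-Encoding header on commas and insists that the last entry be "chunked". Under that assumption, the "Transfer-Encoding: identity" header in the Privoxy log above would be rejected immediately, regardless of the timeout.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (NOT part of Net::HTTP): mimics a strict check that
# only accepts a Transfer-Encoding list whose final entry is "chunked".
sub check_transfer_encoding {
    my ($header_value) = @_;

    # Assumption: the header is a comma-separated list of encodings.
    my @encodings = split /\s*,\s*/, lc $header_value;
    my $last = pop @encodings;

    die "Chunked must be last Transfer-Encoding '$header_value'\n"
        unless defined $last && $last eq 'chunked';

    return 1;
}

# Try a few header values, including the one seen in the Privoxy log.
for my $te ('chunked', 'gzip, chunked', 'identity') {
    my $ok = eval { check_transfer_encoding($te) };
    printf "%-15s => %s\n", $te, $ok ? 'accepted' : 'rejected';
}
```

If this sketch matches what the module does, the failure comes from the header that Privoxy writes back, not from the script's timeout or cookie settings.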

I haven't gotten any further on this problem. If someone else can suggest something, I'd be very appreciative!!

Edited by planetscape - linkified link and changed pre to code tags
