Mechanize Returns Garbled Content

by Anonymous Monk
on Jun 28, 2011 at 22:09 UTC ( #911860=perlquestion )
Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

I have a script whose output is human-readable on my machine. But when I run the same script on another machine, it returns garbled content.

Both machines are Redhat 5.
#!/usr/bin/perl
use strict;
use WWW::Mechanize;

my $m = WWW::Mechanize->new;
$m->get( 'http://www.google.com' );
print $m->content;

Re: Mechanize Returns Garbled Content
by ikegami (Pope) on Jun 28, 2011 at 22:22 UTC
    What do you get from
    print $m->response->headers->as_string;
      Both machines returned some header information. On the 'good' machine, it returned this.
      Cache-Control: private, max-age=0
      Connection: close
      Date: Tue, 28 Jun 2011 22:45:06 GMT
      Server: gws
      Content-Type: text/html; charset=ISO-8859-1
      Expires: -1
      Client-Date: Tue, 28 Jun 2011 22:43:19 GMT
      Client-Peer: xx.xx.xx.xx:80
      Client-Response-Num: 1
      Set-Cookie: PREF=ID=c390276a4d39152b:FF=0:TM=1309301106:LM=1309301106:S=OOuAOCNBYRLJCHWp; expires=Thu, 27-Jun-2013 22:45:06 GMT; path=/; domain=.google.com
      Set-Cookie: NID=48=H5aMKdl8PlT40vE7xU3rxWQ0Py7lY4AXt8L-BJ9q3ZIp4QN8riFAnUTh_gYtX6s_dG-pf3FFPKLs1M80BC2z3SDbma5vGWFi_h0wgCmHSbQzCYFW0nD2KoyEI4FxEukb; expires=Wed, 28-Dec-2011 22:45:06 GMT; path=/; domain=.google.com; HttpOnly
      Title: Google
      X-Meta-Description: Search the world's information, including webpages, images, videos and more. Google has many special features to help you find exactly what you're looking for.
      X-Meta-Robots: noodp
      X-XSS-Protection: 1; mode=block
      On the 'bad' machine, it returned this.
      Cache-Control: private, max-age=0
      Connection: close
      Date: Tue, 28 Jun 2011 22:45:19 GMT
      Server: gws
      Content-Encoding: gzip
      Content-Length: 6075
      Content-Type: text/html; charset=UTF-8
      Expires: -1
      Client-Date: Tue, 28 Jun 2011 22:45:19 GMT
      Client-Peer: xx.xx.xx.xx:80
      Client-Response-Num: 1
      Set-Cookie: PREF=ID=ee94fe222925b98d:FF=0:TM=1309301119:LM=1309301119:S=93SJpaSxcxzXIjMh; expires=Thu, 27-Jun-2013 22:45:19 GMT; path=/; domain=.google.com
      Set-Cookie: NID=48=Zvuevcb_CgIY6rP-Gq65L1oR6r41cs3nYFNAFNoYfwOqFZxcBhrbW_x4PTjTfgVmh7fovclmf2dWwJu6F-c6NGvMQhdoASsxN07mlNVP1Pi7XbeL6LeqkhJ9Jvy8wCj7; expires=Wed, 28-Dec-2011 22:45:19 GMT; path=/; domain=.google.com; HttpOnly
      X-XSS-Protection: 1; mode=block
      The two responses differ, and the one on the 'bad' machine says the content is gzip-encoded?

        Content-Encoding: gzip accounts for the garbling, but WWW::Mechanize uses ->decoded_content to handle that. Try upgrading

        • WWW::Mechanize
        • IO::Uncompress::Gunzip
        • HTTP::Message
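        Before upgrading, it may help to check whether the installed HTTP::Message can undo gzip at all. A minimal sketch, assuming a libwww-perl recent enough to provide HTTP::Message::decodable(); the variable names are only illustrative:

```perl
use strict;
use warnings;
use HTTP::Message;

# Ask HTTP::Message which Content-Encodings decoded_content() can undo
# with the modules currently installed on this machine.
my @encodings = HTTP::Message::decodable();
print "decodable encodings: @encodings\n";

# gzip only appears in the list when the gunzip module can be loaded.
my $can_gunzip = grep { $_ eq 'gzip' } @encodings;
print $can_gunzip
    ? "gzip responses can be decoded here\n"
    : "gzip is not decodable here; check that IO::Uncompress::Gunzip is installed\n";
```

        Running this on both machines and comparing the output should show whether the gunzip support differs between them.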

        Line 5 from the "good" machine: Content-Type: text/html; charset=ISO-8859-1
        Line 7 from the "bad" machine: Content-Type: text/html; charset=UTF-8

        Could the differing charset make a difference?

Re: Mechanize Returns Garbled Content
by trwww (Priest) on Jun 28, 2011 at 22:31 UTC

    or what about:

    print $m->res->decoded_content

    ?
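    To see the difference between content and decoded_content without touching the network, here is a rough offline sketch. It builds a gzip-encoded HTTP::Response by hand, so the HTML string and the headers are made up for illustration, and it assumes IO::Compress::Gzip and IO::Uncompress::Gunzip are both installed:

```perl
use strict;
use warnings;
use HTTP::Response;
use IO::Compress::Gzip qw(gzip $GzipError);

# Made-up page body, compressed the way a server would send it.
my $html = "<html><body>hello</body></html>";
gzip( \$html => \my $compressed ) or die "gzip failed: $GzipError";

# Hand-built response that mimics the headers seen on the 'bad' machine.
my $res = HTTP::Response->new(
    200, 'OK',
    [
        'Content-Type'     => 'text/html; charset=UTF-8',
        'Content-Encoding' => 'gzip',
    ],
    $compressed,
);

my $raw     = $res->content;          # still the compressed bytes
my $decoded = $res->decoded_content;  # undef when decoding is not possible

print "raw body starts with the gzip magic bytes\n"
    if substr( $raw, 0, 2 ) eq "\x1f\x8b";
print defined $decoded ? "decoded: $decoded\n" : "decoding failed\n";
```

    If the decoded line prints the HTML, the decoding path works on that machine; if it prints "decoding failed", the raw bytes are all there is to show, which would look garbled.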

    EDIT: your code works fine for me... maybe there is a proxy between you and google that compresses the data?

    EDIT 2: I didn't see where you said it was working ok on your machine... my guess is still some type of proxy between the machine it is not working properly on and google.

      On the 'good' machine, it returned the google content page. On the 'bad' machine, it did not return anything.
      I spoke to our admin people and they confirmed that there is no proxy sitting between the 'bad' machine and google.
Re: Mechanize Returns Garbled Content
by bichonfrise74 (Vicar) on Jun 29, 2011 at 00:38 UTC
    Maybe you should look at what you are sending? Eg. run tcpdump when you run your script?
Re: Mechanize Returns Garbled Content
by Anonymous Monk on Jun 29, 2011 at 17:08 UTC
    As ikegami suggested, I tried to update IO::Uncompress::Gunzip on the 'bad' machine and found out that the module was not even installed.

    Once I installed it, the content was no longer garbled and everything was human-readable.

    Question:
    Is this a bug in Mechanize? Shouldn't it fail when it cannot gunzip gzipped content?
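    Whether Mechanize itself should die is a separate question, but a script can be made to fail loudly on its own. The sketch below is only one possible approach: body_or_die is a hypothetical helper, not part of WWW::Mechanize, and it relies on decoded_content returning undef when it cannot decode. The demonstration uses a hand-built response with a made-up encoding name so it runs offline:

```perl
use strict;
use warnings;
use HTTP::Response;

# Hypothetical helper: return the decoded body, or die instead of
# quietly handing back bytes that could not be decoded.
sub body_or_die {
    my ($res) = @_;
    my $body = $res->decoded_content;
    return $body if defined $body;

    my $enc = $res->header('Content-Encoding');
    $enc = 'none' unless defined $enc;
    die "Could not decode response (Content-Encoding: $enc)\n";
}

# Offline demonstration: 'x-made-up' is an invented encoding that no
# decoder handles, standing in for gzip on a machine without gunzip support.
my $res = HTTP::Response->new(
    200, 'OK',
    [
        'Content-Type'     => 'text/plain',
        'Content-Encoding' => 'x-made-up',
    ],
    'opaque bytes',
);

my $ok = eval { body_or_die($res); 1 };
print $ok ? "decoded\n" : "refused: $@";
```

    In the original script this would be called as body_or_die( $m->response ) in place of $m->content.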
