PerlMonks  

WWW::Mechanize trouble reading a text/plain file on a server

by gargle (Chaplain)
on May 24, 2011 at 07:52 UTC ( [id://906441]=perlquestion )

gargle has asked for the wisdom of the Perl Monks concerning the following question:

My fellow monks,

A stupid question perhaps, but this has been bugging me for a while now. I tried the usual: I read the docs, searched PerlMonks, and googled, all to no avail.

I am trying to access a log file on a TestLink server, but all I get is an empty string if I just do a 'get'. The weird thing is that if I fetch the log file into a temporary file, I can read that file.

This doesn't work:

$mech->get( $TLKAUDITSURL );
print $mech->content();

The headers I get in the response object are:

$VAR1 = bless( {
    '_protocol' => 'HTTP/1.1',
    '_content' => '',
    '_rc' => 200,
    '_headers' => bless( {
        'connection' => 'close',
        'client-response-num' => 1,
        'last-modified' => 'Tue, 24 May 2011 06:50:00 GMT',
        'accept-ranges' => 'bytes',
        'date' => 'Tue, 24 May 2011 07:29:55 GMT',
        'client-peer' => '10.192.39.90:80',
        'content-length' => '21097062',
        'client-date' => 'Tue, 24 May 2011 07:29:57 GMT',
        'etag' => '"aa-141ea66-4a3ffff2d7600"',
        'content-type' => 'text/plain',
        'server' => 'Apache/2.2.15 (Unix) PHP/5.3.2'
    }, 'HTTP::Headers' ),
    '_msg' => 'OK',
    '_request' => bless( {
        '_content' => '',
        '_uri' => bless( do{\(my $o = 'http://tlk/logs/audits.log')}, 'URI::http' ),
        '_headers' => bless( {
            'user-agent' => 'WWW-Mechanize/1.68',
            'accept-encoding' => 'gzip',
            'referer' => 'http://tlk/logs/audits.log'
        }, 'HTTP::Headers' ),
        '_method' => 'GET',
        '_uri_canonical' => $VAR1->{'_request'}{'_uri'}
    }, 'HTTP::Request' )
}, 'HTTP::Response' );

This however does work:

$mech->get( $TLKAUDITSURL, ':content_file' => $TMPAUDITSLOG );
open( FH, '<', $TMPAUDITSLOG );
my @check = <FH>;
close FH;
foreach my $line (@check) {
    chomp $line;
    print $line . "\n";
}

In the info returned I find this as an extra:

'handlers' => {
    'response_done' => [
        { 'callback' => sub { "DUMMY" } }
    ]
},

What on earth am I missing? I don't want to use a temp file, I just want to access the log file by means of a simple $mech->content() or something similar (but no call to system or any backticks to invoke wget or curl).

--
if ( 1 ) { $postman->ring() for (1..2); }

Replies are listed 'Best First'.
Re: WWW::Mechanize trouble reading a text/plain file on a server
by Corion (Patriarch) on May 24, 2011 at 08:07 UTC

    You'll need to look at what goes over the wire to find the difference.
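One way to start looking, without leaving Perl, is to attach handlers to the user agent and print each request and response as LWP sees them. The sketch below is only an illustration: `install_wire_dump` is a made-up helper name, and the trace shows LWP's view of the exchange rather than the raw bytes on the socket, so a packet sniffer may still be needed to inspect chunking itself.

```perl
use strict;
use warnings;

# Hypothetical helper: attach logging handlers to any LWP::UserAgent
# (WWW::Mechanize is a subclass). $out is the filehandle to write the trace to.
sub install_wire_dump {
    my ( $ua, $out ) = @_;

    # Runs just before the request is handed to the protocol layer.
    $ua->add_handler(
        request_send => sub {
            my ($request) = @_;
            print {$out} "--- request ---\n", $request->as_string, "\n";
            return;    # returning nothing lets the request proceed normally
        }
    );

    # Runs once the response has been fully received.
    $ua->add_handler(
        response_done => sub {
            my ($response) = @_;
            print {$out} "--- response ---\n",
                $response->status_line, "\n",
                $response->headers->as_string,
                'body bytes held in memory: ', length( $response->content ), "\n";
            return;
        }
    );
    return $ua;
}

# Usage: perl wire_dump.pl http://tlk/logs/audits.log
if (@ARGV) {
    require WWW::Mechanize;
    my $mech = WWW::Mechanize->new( autocheck => 0 );
    install_wire_dump( $mech, \*STDERR );
    $mech->get( $ARGV[0] );
    print length( $mech->content ), " bytes via content()\n";
}
```

Running it once with a plain get and once with `':content_file'` added to the get call should make any difference between the two responses visible side by side.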

    My vague guess would be that you get a chunked response but WWW::Mechanize->content does not collect the whole response while the :content_file callback loops to collect the whole response. This would be a bug in LWP::UserAgent or WWW::Mechanize, so both version numbers would be helpful.
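If that guess is right, a possible workaround that avoids the temp file is to collect the chunks yourself with the `:content_cb` option that LWP::UserAgent's get accepts (WWW::Mechanize passes the extra arguments through). This is an untested sketch against the server in question, not a confirmed fix; `make_collector` is a made-up helper name. Note that the chunks should arrive as sent by the server, so if the server answers the `accept-encoding: gzip` request header with compressed data, the buffer would still need decoding.

```perl
use strict;
use warnings;

# Hypothetical helper: build a callback that appends every received chunk
# to the scalar referenced by $buf_ref.
sub make_collector {
    my ($buf_ref) = @_;
    return sub {
        my ($chunk) = @_;    # LWP also passes the response and protocol objects
        ${$buf_ref} .= $chunk;
        return;
    };
}

# Usage: perl fetch_in_memory.pl http://tlk/logs/audits.log
if (@ARGV) {
    require WWW::Mechanize;
    my $mech = WWW::Mechanize->new( autocheck => 0 );
    my $log  = '';
    my $response = $mech->get( $ARGV[0], ':content_cb' => make_collector( \$log ) );
    die 'Fetch failed: ', $response->status_line, "\n"
        unless $response->is_success;
    print "$_\n" for split /\n/, $log;
}
```

With a callback in place the data ends up in `$log`, so `$mech->content()` is expected to stay empty for this request.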

    Beware of blindly upgrading to LWP::UserAgent 6+, as it has more requirements on https certificates.

Re: WWW::Mechanize trouble reading a text/plain file on a server
by Anonymous Monk on May 24, 2011 at 08:04 UTC
    What on earth am I missing?

    Proof :D Maybe dump the headers of the response where you use :content_file, you might see a difference.

    There really are no red flags in what you've shown. My conclusion: some servers are just broken.

    It pays to always upgrade WWW::Mechanize/LWP to the latest versions:

    $ pmvers WWW::Mechanize LWP HTTP::Headers HTTP::Request URI HTTP::Response
    WWW::Mechanize: 1.68
    LWP: 6.02
    HTTP::Headers: 6.00
    HTTP::Request: 6.00
    URI: 1.58
    HTTP::Response: 6.01
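If pmvers is not installed, roughly the same report can be produced from Perl itself. A minimal sketch (`module_version` is a made-up helper, and the module list simply mirrors the one above):

```perl
use strict;
use warnings;

# Return the installed version of a module, or undef if it cannot be loaded
# (or declares no $VERSION).
sub module_version {
    my ($module) = @_;
    ( my $file = "$module.pm" ) =~ s{::}{/}g;    # Foo::Bar -> Foo/Bar.pm
    return eval { require $file; $module->VERSION };
}

for my $module (qw(WWW::Mechanize LWP HTTP::Headers HTTP::Request URI HTTP::Response)) {
    my $version = module_version($module);
    printf "%s: %s\n", $module, defined $version ? $version : 'not available';
}
```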
