Checking contents of fetched URL.

ajay.awachar has asked for the wisdom of the Perl Monks concerning the following question:

Hi All, I'm having trouble getting the actual contents of a fetched URL. My requirement is to hit a certain URL and check whether its content contains the expected search string. But instead of the content I'm getting a stringified HASH reference. Following is the sample code snippet.

#!/usr/bin/perl
use strict;
use LWP::UserAgent;
use LWP::Simple;


## set URL
my $url = "http://google.com";
my $ua = LWP::UserAgent->new();

my $res = $ua->get($url);
$res->content_type('text/html');
my $response = $res->content;
print "Response Type = ".$res->content_type;
print "\nResponse Status ".$res->status_line." Res code ";
## view

print "\nResponse = $res->content \n";

Following is the Result I'm getting from above code.

Response Type = text/html
Response Status 200 OK Res code
Response = HTTP::Response=HASH(0x8401064)->content

How can I get the actual text content?

Please Help.

Thanks

-Ajay


Replies are listed 'Best First'.
Re: Checking contents of fetched URL. by holli (Abbot) on Jan 29, 2010 at 22:54 UTC
The presence of `->content` in your output should have given you a hint. The term `$res->content` does not get interpolated, only `$res` does. You want: `print "\nResponse = ", $res->content, " \n";` just as you did in the two lines above.

holli
You can lead your users to water, but alas, you cannot drown them.
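To illustrate the interpolation rule described above, here is a minimal sketch. `My::Response` is a hypothetical stand-in class (it is not part of LWP), used only so the snippet runs without network access. It shows what a double-quoted string does with a method call, plus two ways of getting the method's return value into the output.

```perl
use strict;
use warnings;

# Hypothetical stand-in for HTTP::Response, for illustration only.
package My::Response;
sub new     { my ($class, %args) = @_; return bless {%args}, $class }
sub content { my ($self) = @_; return $self->{content} }

package main;

my $res = My::Response->new( content => '<html>hello</html>' );

# Only $res is interpolated (as My::Response=HASH(0x...));
# the "->content" part stays in the string as literal text.
my $wrong = "Response = $res->content";

# Call the method outside the string and concatenate the result.
my $right = "Response = " . $res->content;

# Alternative: interpolate an anonymous array holding the return value.
my $inline = "Response = @{[ $res->content ]}";

print "$wrong\n$right\n$inline\n";
```

Passing a list to `print`, as in the reply above, avoids building the intermediate string at all.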
Re^2: Checking contents of fetched URL. by ajay.awachar (Acolyte) on Jan 29, 2010 at 23:43 UTC
I got your point on the interpolation problem with `$res->content`. But even after replacing it with `print "\nResponse = ", $res->content, " \n";` I am still not getting the actual contents; `$res->content` gives me no output at all. I even tried `print "Response = ".$res->as_string();` but it gives me errors about HTML/HeadParser.pm being unavailable. Any ideas on this?

Ajay
Re^3: Checking contents of fetched URL. by Anonymous Monk on Jan 30, 2010 at 16:18 UTC
HTML::HeadParser is a prerequisite; it is supposed to be installed already.
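One way to confirm whether the module can actually be loaded is a quick check like the sketch below. The helper name `module_available` is hypothetical; it relies only on `eval` and `require`, which are core Perl.

```perl
use strict;
use warnings;

# Hypothetical helper: returns 1 if the named module can be loaded, else 0.
sub module_available {
    my ($module) = @_;
    (my $file = "$module.pm") =~ s{::}{/}g;   # HTML::HeadParser -> HTML/HeadParser.pm
    return eval { require $file; 1 } ? 1 : 0;
}

for my $module (qw(LWP::UserAgent HTML::HeadParser)) {
    printf "%-20s %s\n", $module,
        module_available($module) ? 'installed' : 'MISSING';
}
```

If HTML::HeadParser is reported as missing, installing the HTML-Parser distribution from CPAN should provide it.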
Re: Checking contents of fetched URL. by ikegami (Patriarch) on Jan 29, 2010 at 23:43 UTC
In addition, I'd like to point out that it pretty much never makes sense to call `->content`. You want `$res->decoded_content( charset => 'none' )` or `$res->decoded_content()`. The former returns the file the server returned as a string of bytes. The latter returns the file the server returned as a string of characters, if possible. That is to say, it first removes the character encoding from text responses (text/plain, text/html, etc). Recent versions also remove the character encoding from XML responses (application/xml).
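The difference between the two calls can be seen in a small sketch. It assumes HTTP::Response (installed alongside LWP) is available, and it builds the response by hand so that no network access is needed; the body and header values are made up for illustration.

```perl
use strict;
use warnings;
use HTTP::Response;

# Hand-built response (illustrative values, not a real fetch).
# The body is the UTF-8 encoding of "café": 5 bytes, 4 characters.
my $res = HTTP::Response->new(
    200, 'OK',
    [ 'Content-Type' => 'text/html; charset=UTF-8' ],
    "caf\xC3\xA9",
);

# Content-Encoding (if any) undone, character encoding left in place: bytes.
my $bytes = $res->decoded_content( charset => 'none' );

# Character encoding removed as well: a string of characters.
my $chars = $res->decoded_content();

printf "bytes: %d, characters: %d\n", length($bytes), length($chars);
```

When searching the body for text, the character string is usually the one to match against, since the search string in the program is typically characters too.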
Re: Checking contents of fetched URL. by blakew (Monk) on Jan 30, 2010 at 07:39 UTC
You can also use HTML::TreeBuilder to parse the HTML and use HTML::Element's dumping methods for pretty-printing:

use HTML::TreeBuilder;

my $tree = HTML::TreeBuilder->new; # empty tree
$tree->parse( $res->content );
$tree->eof; # tells it to parse the whole thing

# Straight content dump
$tree->dump;

# HTML-formatted
print $tree->as_HTML;

# Just the text
print $tree->as_trimmed_text;

# Cleanup the tree
$tree->delete;
Re: Checking contents of fetched URL. by Anonymous Monk on Jan 29, 2010 at 22:51 UTC
# perldoc HTTP::Response

NAME
    HTTP::Response - HTTP style response message

SYNOPSIS
    Response objects are returned by the request() method of the "LWP::UserAgent":

        # ...
        $response = $ua->request($request)
        if ($response->is_success) {
            print $response->content;
        }
        else {
            print STDERR $response->status_line, "\n";
        }
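Putting the success check from the synopsis together with the original requirement (does the page contain a given string?), a minimal sketch might look like this. The helper name `response_contains` is hypothetical, and the response is built by hand so the example runs offline; with LWP you would pass the object returned by `$ua->get($url)` instead.

```perl
use strict;
use warnings;
use HTTP::Response;

# Hypothetical helper: returns 1 if the request succeeded and the decoded
# body contains $needle as a literal substring, else 0.
sub response_contains {
    my ($res, $needle) = @_;
    return 0 unless $res->is_success;
    my $body = $res->decoded_content;
    return 0 unless defined $body;
    return index($body, $needle) >= 0 ? 1 : 0;
}

# Hand-built response standing in for a real fetch (illustrative values).
my $res = HTTP::Response->new(
    200, 'OK',
    [ 'Content-Type' => 'text/html' ],
    '<html><title>Google</title></html>',
);

print response_contains($res, 'Google') ? "found\n" : "not found\n";
```

Using `index` rather than a regex keeps the search string literal, so characters such as `.` or `?` in it need no escaping.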

