PerlMonks  

Checking contents of fetched URL.

by ajay.awachar (Acolyte)
on Jan 29, 2010 at 22:24 UTC ( [id://820411]=perlquestion )

ajay.awachar has asked for the wisdom of the Perl Monks concerning the following question:

Hi all, I'm having trouble getting the actual contents of a fetched URL. I need to request a certain URL and check whether its content contains the correct search string. But I'm getting a HASH reference string instead of the content. The following is a sample code snippet.

#!/usr/bin/perl
use strict;
use LWP::UserAgent;
use LWP::Simple;

## set URL
my $url = "http://google.com";
my $ua = LWP::UserAgent->new();
my $res = $ua->get($url);
$res->content_type('text/html');
my $response = $res->content;
print "Response Type = ".$res->content_type;
print "\nResponse Status ".$res->status_line." Res code ";
## view
print "\nResponse = $res->content \n";

The following is the result I'm getting from the above code.

Response Type = text/html
Response Status 200 OK Res code
Response = HTTP::Response=HASH(0x8401064)->content

How can I get the actual text content?

Please Help.

Thanks

-Ajay

Replies are listed 'Best First'.
Re: Checking contents of fetched URL.
by holli (Abbot) on Jan 29, 2010 at 22:54 UTC
    The presence of ->content in your output should have given you a hint. The term $res->content does not get interpolated, only $res does.

    You want: print "\nResponse = ", $res->content, " \n"; just as you did in the two lines above.
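The interpolation rule described above can be reproduced without any network access. The following is only a minimal sketch: FakeResponse is a hypothetical stand-in class, not part of LWP, used so the example runs with core Perl alone.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical stand-in for HTTP::Response, so the sketch runs without LWP.
package FakeResponse;
sub new     { my ($class, $body) = @_; return bless { body => $body }, $class }
sub content { my ($self) = @_; return $self->{body} }

package main;

my $res = FakeResponse->new('<html>hello</html>');

# Only $res is interpolated here; "->content" stays as literal text,
# so the string contains something like FakeResponse=HASH(0x...)->content.
my $wrong = "Response = $res->content";

# Calling the method outside the quotes gives the real body.
my $right = "Response = " . $res->content;

# The @{[ ... ]} idiom interpolates an arbitrary expression inside a string.
my $also_right = "Response = @{[ $res->content ]}";

print "$wrong\n$right\n$also_right\n";
```

The first line printed shows the stringified object followed by the literal text ->content, which matches the symptom in the question. The other two lines print the body.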


    holli

    You can lead your users to water, but alas, you cannot drown them.

      I got your point about the interpolation problem with $res->content. But even after replacing it with

       print "\nResponse = ", $res->content, " \n";

      I am still not getting the exact contents. I'm not getting any output for $res->content. I even tried print "Response = ".$res->as_string(); but it gives me errors saying that HTML/HeadParser.pm is unavailable. Any ideas?

      Ajay
Re: Checking contents of fetched URL.
by ikegami (Patriarch) on Jan 29, 2010 at 23:43 UTC
    In addition, I'd like to point out it pretty much never makes sense to call ->content. You want
    $res->decoded_content( charset => 'none' ) or $res->decoded_content()

    The former returns the file the server returned as a string of bytes.

    The latter returns the file the server returned as a string of characters, if possible. That is to say, it first removes the character encoding from text responses (text/plain, text/html, etc). Recent versions also remove the character encoding from XML responses (application/xml).
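To make the difference between the two calls concrete, here is a small sketch that builds a response by hand rather than fetching one. It assumes HTTP::Response is installed (it ships in the HTTP::Message distribution that LWP depends on). The body text and the UTF-8 charset header are illustrative choices, not taken from the original post.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Encode qw(encode);
use HTTP::Response;

# Hypothetical body: 4 characters, the last one non-ASCII (e with acute).
my $text  = "caf\x{e9}";
my $bytes = encode('UTF-8', $text);    # 5 bytes once UTF-8 encoded

# Build the response locally instead of hitting the network.
my $res = HTTP::Response->new(
    200, 'OK',
    [ 'Content-Type' => 'text/html; charset=UTF-8' ],
    $bytes,
);

# Bytes as sent by the server (no character decoding applied).
my $raw = $res->decoded_content( charset => 'none' );

# Characters, decoded according to the charset in Content-Type.
my $chars = $res->decoded_content;

printf "bytes: %d, characters: %d\n", length($raw), length($chars);
```

With this input the byte string is one element longer than the character string, because the accented letter takes two bytes in UTF-8. For a search-string check like the one in the question, matching against the character string is usually the safer choice.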

Re: Checking contents of fetched URL.
by blakew (Monk) on Jan 30, 2010 at 07:39 UTC
    You can also use HTML::TreeBuilder to parse the HTML and use HTML::Element's dumping methods for pretty-printing:
    use HTML::TreeBuilder;

    my $tree = HTML::TreeBuilder->new; # empty tree
    $tree->parse( $res->content );
    $tree->eof; # tells it to parse the whole thing

    # Straight content dump
    $tree->dump;

    # HTML-formatted
    print $tree->as_HTML;

    # Just the text
    print $tree->as_trimmed_text;

    # Cleanup the tree
    $tree->delete;
Re: Checking contents of fetched URL.
by Anonymous Monk on Jan 29, 2010 at 22:51 UTC
    # perldoc HTTP::Response
    NAME
        HTTP::Response - HTTP style response message

    SYNOPSIS
        Response objects are returned by the request() method of the "LWP::UserAgent":

            # ...
            $response = $ua->request($request)
            if ($response->is_success) {
                print $response->content;
            }
            else {
                print STDERR $response->status_line, "\n";
            }
