PerlMonks  

Using LWP::UserAgent to Get Web Pages

by jayanth (Initiate)
on Jun 14, 2007 at 15:04 UTC ( [id://621260]=perlquestion )

jayanth has asked for the wisdom of the Perl Monks concerning the following question:

Hi,

I'm using the LWP::UserAgent module to access web pages. Given below is the code which is supposed to return an HTTP::Response object with the contents of the requested page.
    #!/usr/bin/perl
    use LWP::UserAgent;
    print "Content-type: text/html\n\n";
    my $ua = new LWP::UserAgent();
    my $url = 'http://www.google.com/';
    my $httpresp = $ua->get($url);
    print $httpresp->content;
When I run this code, I get the message '500 Can't connect to www.google.com:80 (Bad hostname 'www.google.com')'.

I'm running the code on a Fedora Core 6 machine. I access the Internet directly, without any proxy server. I'm not using any firewall either.

Please help.

Thanks,

Jayanth

Replies are listed 'Best First'.
Re: Using LWP::UserAgent to Get Web Pages
by Popcorn Dave (Abbot) on Jun 14, 2007 at 15:54 UTC
    It works just fine on Windows too.

    How many times did you try this? Did it ever return anything for you? The reason I ask is that I seem to recall that Google doesn't really like people writing their own page scrapers so they may have blocked your IP temporarily. They have their own tools they want you to use.

    What you might consider trying is changing the URL that you're trying to access to see if Google is blocking your IP.

    The other question I have is why you have the here doc in that snippet of code, because you're never using it. Or was it part of the larger program?


    Revolution. Today, 3 O'Clock. Meet behind the monkey bars.

    I would love to change the world, but they won't give me the source code

Re: Using LWP::UserAgent to Get Web Pages
by pajout (Curate) on Jun 14, 2007 at 15:25 UTC
    I ran your code on my FC6, it worked fine.
Re: Using LWP::UserAgent to Get Web Pages
by moritz (Cardinal) on Jun 14, 2007 at 18:27 UTC
    What happens if you start this script from the command line, not through CGI?

    Can you access non-Google URLs?

    Somehow this doesn't smell like a perl problem to me - but I could be wrong, of course.
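
One way to test the "not a Perl problem" theory is to check name resolution from inside the failing environment. The "Bad hostname" part of the message usually means the lookup of the host name failed before any HTTP traffic was sent. The sketch below uses only the core Socket module; the helper name resolve_host is made up for illustration, and the idea is to run the same file once from the shell and once as a CGI script and compare the two outputs.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Socket qw(inet_aton inet_ntoa);

# Hypothetical helper: resolve a host name to a dotted-quad address.
# Returns undef when the name cannot be resolved.
sub resolve_host {
    my ($host) = @_;
    my $packed = inet_aton($host);
    return defined $packed ? inet_ntoa($packed) : undef;
}

# Plain-text output so the result is readable both in a browser and
# on the command line.
print "Content-type: text/plain\n\n";
for my $host ('www.google.com') {
    my $addr = resolve_host($host);
    printf "%-20s %s\n", $host, defined $addr ? $addr : 'resolution FAILED';
}
```

If the address prints from the command line but the CGI run reports a failure, the difference lies in the environment the web server gives the script rather than in the LWP code.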

Re: Using LWP::UserAgent to Get Web Pages
by naikonta (Curate) on Jun 15, 2007 at 01:56 UTC
    I'm running the code on a Fedora Core 6 machine. I access the Internet directly, without any proxy server. I'm not using any firewall either.
    In that situation: a) does it work from your browser? b) can you confirm the result of the following commands?
    $ HEAD http://www.google.com/
    # and
    $ perl -MLWP::Simple -le 'print head "http://www.google.com" ? "OK" : "NO"'

    Open source softwares? Share and enjoy. Make profit from them if you can. Yet, share and enjoy!

Re: Using LWP::UserAgent to Get Web Pages
by varian (Chaplain) on Jun 15, 2007 at 07:48 UTC
    To learn more about the cause of the error the following will get you more verbose output from LWP:
    use LWP::Debug qw(+);
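
A caveat, as far as I know: LWP::Debug writes its trace to STDERR, which under CGI normally ends up in the web server's error log instead of the page, and later libwww-perl releases mark the module as deprecated. A minimal sketch of an alternative is to print the response status into the page itself. The helper name describe_response is hypothetical. It relies only on the is_success, status_line and header methods of the response, and on the Client-Warning header that LWP adds to responses it generates internally. LWP::UserAgent is loaded with require so that a failure to load the module under the web server is also reported.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: summarise an HTTP::Response-like object as text.
# Only is_success, status_line and header are called on it.
sub describe_response {
    my ($resp) = @_;
    my $warning = $resp->header('Client-Warning') || 'none';
    return sprintf "success: %s\nstatus: %s\nClient-Warning: %s\n",
        ($resp->is_success ? 'yes' : 'no'), $resp->status_line, $warning;
}

print "Content-type: text/plain\n\n";
if (eval { require LWP::UserAgent; 1 }) {
    # The 10 second timeout is an arbitrary choice for this sketch.
    my $ua = LWP::UserAgent->new(timeout => 10);
    print describe_response($ua->get('http://www.google.com/'));
}
else {
    print "Could not load LWP::UserAgent: $@";
}
```

A Client-Warning value of "Internal response" would indicate that the 500 was produced by LWP on the client side and not sent by the remote server.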

    My hunch is that this is either related to your DNS settings or to security settings on your system.

    Try to 'ping' Google to check your DNS settings.
    If that works fine, then check that the ping command also works in the context of a CGI environment executed by your web server (e.g. include it with backticks in a CGI Perl program).
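
The backticks check can be sketched roughly as follows. The helper name run_command is made up for illustration, and the ping options (-c 1 for a single packet, -w 5 for a five second deadline) assume the Linux iputils ping; other systems may need different flags.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: run a shell command with backticks, folding
# STDERR into the captured output, and return (output, exit status).
sub run_command {
    my ($cmd) = @_;
    my $output = `$cmd 2>&1`;
    $output = '' unless defined $output;
    return ($output, $? >> 8);
}

# Plain text keeps the ping output readable in the browser.
print "Content-type: text/plain\n\n";
my ($out, $status) = run_command('ping -c 1 -w 5 www.google.com');
print "exit status: $status\n$out";
```

Put this in the same CGI directory as the failing script and request it through the web server. Comparing that output with a run from the shell shows whether name resolution and network access differ between the two environments.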

Re: Using LWP::UserAgent to Get Web Pages
by jayanth (Initiate) on Jun 16, 2007 at 16:03 UTC
    Hi,

    Popcorn Dave: It doesn't work for any site.

    moritz: It works if I run the script from the command line!

    naikonta: Websites are accessible from the browser. The second command you suggested, perl -MLWP::Simple -le 'print head "http://www.google.com" ? "OK" : "NO"', returns OK.

    varian: Pinging 'www.google.com' is successful. As for doing it from the CGI environment, I have no idea how to go about it! :-( I'll check it out anyway.

    Thanks for the responses. Hope someone will be able to suggest a solution.

    Jayanth
      What about varian's Debug suggestion?
        There was no difference in behaviour.

        Jayanth
