PerlMonks  

Re: Any idea why LWP:Simple doesn't like this particular website?

by epoptai (Curate)
on Jul 02, 2001 at 03:39 UTC (#93080)


in reply to Any idea why LWP:Simple doesn't like this particular website?

Update: I tried LWP on that site and it works:

perl -e "use LWP::Simple; getprint 'http://www.theredkitchen.net/'"

It's possible for a site to block LWP transfers by checking $ENV{'HTTP_USER_AGENT'} on the server side. You could try using LWP::UserAgent to specify your own agent:

use LWP::UserAgent;
my $ua = LWP::UserAgent->new;
$ua->agent('Odyssey/2001');
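
As a fuller sketch of that idea, the script below builds an LWP::UserAgent with a custom agent string and reports the HTTP status line when the request fails, which get from LWP::Simple cannot do because it only returns undef. The helper names make_agent and fetch and the 15-second timeout are assumptions made for illustration, and there is no guarantee that a different agent string is what this particular site needs.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use LWP::UserAgent;

# Build a user agent that identifies itself with a custom string
# instead of LWP's default one. (make_agent is a hypothetical helper.)
sub make_agent {
    my ($name) = @_;
    my $ua = LWP::UserAgent->new;
    $ua->agent($name);
    $ua->timeout(15);    # seconds; an arbitrary choice for this sketch
    return $ua;
}

# Fetch a URL and return (content, status line).
# Content is undef when the request did not succeed.
sub fetch {
    my ($ua, $url) = @_;
    my $response = $ua->get($url);
    my $content  = $response->is_success ? $response->content : undef;
    return ($content, $response->status_line);
}

# Only touch the network when a URL is given on the command line.
if (@ARGV) {
    my $ua = make_agent('Odyssey/2001');
    my ($content, $status) = fetch($ua, $ARGV[0]);
    if (defined $content) {
        print $content;
    }
    else {
        print STDERR "Request failed: $status\n";
    }
}
```

Saved under a name of your choosing, it would be run with the URL as its only argument. If the status line shows something like a 403, the agent string is a plausible culprit; a 500 with a connection message points at the network instead.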

--
Check out my Perlmonks Related Scripts like framechat, reputer, and xNN.


Re: Any idea why LWP:Simple doesn't like this particular website?
by Cody Pendant (Prior) on Jul 02, 2001 at 06:02 UTC
    OK, I'm using the following script to do it, and it returns absolutely nothing except "The script has successfully started up.":
    #!/usr/bin/perl -w
    $| = 1;
    use diagnostics;
    use CGI::Carp qw(fatalsToBrowser);
    use LWP::Simple;
    print "Content-type: text/html\n\n";
    print "The script has successfully started up.\n\n<BR><BR>"; # just to prove there's nothing wrong
    $theURL = 'http://www.theredkitchen.net/';
    unless($doc=get($theURL)){
        print "can't get the URL";
        die "$!";
    }
    print $doc;
    exit;
      As long as you've got -w and diagnostics you might as well use strict and put them to good use. Also declare your variables with my so typos will be apparent. When that is done the only error left is this enigmatic line:
      die "$!";
      which should just be exit: get returns undef on failure without reliably setting $!, so die has nothing meaningful to report there. Try this:
      #!/usr/bin/perl -w
      use strict; # added
      use diagnostics;
      use CGI::Carp qw(fatalsToBrowser);
      use LWP::Simple;
      my $doc;
      my $theURL = 'http://www.theredkitchen.net/';
      print "Content-type: text/html\n\n";
      print "The script has successfully started up.\n\n<BR><BR>";
      unless($doc=get($theURL)){
          print "can't get the URL";
          exit; # added
      }
      print $doc;
      exit;
      This code is tested and works. If it doesn't work for you the problem may lie elsewhere...
        OK, thanks for your help, but it still behaves exactly the same way. I'm launching this script by going to/reloading its URL with a browser by the way.
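
One way to narrow down where the problem lies, when the script is launched from a browser, is to have it print the HTTP status line instead of a bare "can't get the URL". The sketch below is a variant of the script above that uses LWP::UserAgent for that purpose. The helpers escape_html and report, the 15-second timeout, and the check on GATEWAY_INTERFACE are assumptions added for illustration, not something taken from the thread.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use LWP::UserAgent;

# Minimal HTML escaping so an error message cannot inject markup.
sub escape_html {
    my ($text) = @_;
    $text =~ s/&/&amp;/g;
    $text =~ s/</&lt;/g;
    $text =~ s/>/&gt;/g;
    return $text;
}

# Turn an HTTP::Response into the page body this script will print:
# the fetched document on success, the status line otherwise.
sub report {
    my ($response) = @_;
    return $response->content if $response->is_success;
    return "can't get the URL: " . escape_html($response->status_line);
}

# Fetch only when run by a web server, which sets GATEWAY_INTERFACE
# for CGI scripts.
if ($ENV{GATEWAY_INTERFACE}) {
    $| = 1;
    my $theURL = 'http://www.theredkitchen.net/';
    my $ua = LWP::UserAgent->new(timeout => 15);
    print "Content-type: text/html\n\n";
    print report($ua->get($theURL));
}
```

LWP reports failures that happen on its own side, such as being unable to connect, as a 500 response with an explanatory message, so the status line can help tell a refusal by the remote site apart from a web server that has no outbound network access. The timeout also keeps the script from waiting indefinitely.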
