403 forbidden error

by mailmeakhila (Sexton)
on Apr 03, 2012 at 19:41 UTC
mailmeakhila has asked for the wisdom of the Perl Monks concerning the following question:

Hello Monks, I am working on scraping a website. My script is able to get to but not to that particular site. I get a 403 Forbidden error, and it doesn't generate any error logs. I have checked the site's policies and they don't restrict web scraping. My guess is that I am not passing all the required parameters to the site. Any help will be highly appreciated. Thank you, Akhila.
use strict;
use warnings;
use HTTP::Cookies;
use LWP::UserAgent;

# Start with an empty cookie jar and attach it to the user agent.
my $cookie_jar = HTTP::Cookies->new;
$cookie_jar->clear;
my $ua = LWP::UserAgent->new;
$ua->cookie_jar($cookie_jar);
$ua->agent('Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)');
$ua->timeout(30);
my $response = $ua->get("");
print $response->code;
print $response->message;

Replies are listed 'Best First'.
Re: 403 forbidden error
by Old_Gray_Bear (Bishop) on Apr 03, 2012 at 20:43 UTC
    A 403 error means that you can talk to the server, but you do not have the rights and permissions to see the data you are asking for.
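
    To see what the server actually sends back along with the 403, it can help to print more than the code and message. The following is only a sketch, assuming the HTTP::Response class that LWP returns; the helper name describe_response and the sample response are made up for illustration and are not from the original post.

```perl
use strict;
use warnings;
use HTTP::Response;

# Hypothetical helper: summarise a response so that a 403 can be told
# apart from other failures and its headers and body can be inspected.
sub describe_response {
    my ($response) = @_;

    my $summary = $response->status_line . "\n";
    if ($response->code == 403) {
        $summary .= "Server was reached but refused the request.\n";
    }

    # Response headers sometimes name the component that refused the request.
    $summary .= $response->headers->as_string;

    # The first part of the body may carry an explanation from the server.
    $summary .= "\n" . substr($response->decoded_content // '', 0, 200) . "\n";
    return $summary;
}

# Offline stand-in for what $ua->get(...) would return (illustrative values).
my $fake = HTTP::Response->new(
    403, 'Forbidden',
    [ 'Content-Type' => 'text/plain' ],
    'Access denied',
);
print describe_response($fake);
```

    In the original script, the same helper could be called on the result of $ua->get(...) in place of the two print statements.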

    There may be information in the server-side logs, or there may not; that depends on the level of paranoia of the NetAdmin. Either way, if you think you should be able to see the data you asked for, send the admin a note asking why you are getting the 403.

    If you are getting the 403 because you are not providing something to the server, the Admin would still be the one to ask.
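
    If the server is refusing the request because something is missing from it, one thing to compare is the set of headers a browser sends against the set the script sends. The sketch below assumes that extra headers matter for this site, which may not be the case; the URL and the header values are placeholders, not taken from the original post.

```perl
use strict;
use warnings;
use LWP::UserAgent;
use HTTP::Request;

my $ua = LWP::UserAgent->new;
$ua->agent('Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)');

# Headers a browser typically sends; the values are illustrative only.
$ua->default_header('Accept'          => 'text/html,application/xhtml+xml');
$ua->default_header('Accept-Language' => 'en-US,en;q=0.5');

# Build the request without sending it, so that the outgoing headers
# can be inspected. The URL is a placeholder.
my $request = HTTP::Request->new(GET => 'http://example.com/');
$ua->prepare_request($request);
print $request->as_string;
```

    Printing the prepared request shows what the script would send, which can then be compared with what a browser sends for the same page.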

    Also bear in mind that you may be misreading the TOS for the site, and the data may not be intended to be publicly available.

    I Go Back to Sleep, Now.


      Hi, thank you. I am using a standalone machine. I am not connected to any other server.
        You mean, you connect to your own webserver, and ask us why you're getting a 403 error? Might as well ask us what's in your pocket.

        If it's your webserver, you ought to know why you have configured it in such a way that you're getting a 403 error. We will not.

        But regardless, it's not a Perl question. Write the same query in C, Java or Python, and you'll be getting a 403 error.

