PerlMonks  

HTTP::Tiny status 403

by leosole (Novice)
on Sep 23, 2019 at 22:01 UTC ( [id://11106601]=perlquestion )

leosole has asked for the wisdom of the Perl Monks concerning the following question:

Hello, I'm trying to use HTTP::Tiny to get the HTML of a web page, but I'm getting a 403 Forbidden status. Am I doing something wrong?

This is my code:

    use strict;
    use warnings;
    use HTML::TreeBuilder;
    use HTTP::Tiny;
    use Data::Dumper qw(Dumper);
    use IO::Socket::SSL;

    my $link = 'https://www.tudogostoso.com.br/receita/199993-bolo-de-limao.html';
    my $response = HTTP::Tiny->new->get($link);
    my $html;
    print "$response->{status}\t$response->{reason}\n";
    print $response->{content};
    if ($response->{success}) {
        $html = $response->{content};
        print "entrou\n";
    }

And the prints:

    403 Forbidden
    <html><head><title>You have been blocked</title><style>#cmsg{animation: A 1.5s;}@keyframes A{0%{opacity:0;}99%{opacity:0;}100%{opacity:1;}}</style></head><body style="margin:0"><p id="cmsg">Please enable JS and disable any ad blocker</p><script type="3cc341b69494194c916377bc-text/javascript">var dd={'cid':'AHrlqAAAAAMAn9GuxP5ZUjsAut-v0g==','hsh':'C499C5254821BA7F386B459241B3FC','t':'fe'}</script><script src="https://ct.datado.me/c.js" type="3cc341b69494194c916377bc-text/javascript"></script><script src="https://ajax.cloudflare.com/cdn-cgi/scripts/95c75768/cloudflare-static/rocket-loader.min.js" data-cf-settings="3cc341b69494194c916377bc-|49" defer=""></script></body></html>

Replies are listed 'Best First'.
Re: HTTP::Tiny status 403
by haukex (Archbishop) on Sep 23, 2019 at 22:37 UTC
    You have been blocked ... Please enable JS and disable any ad blocker

    The site is telling you what's wrong... perhaps the site doesn't like automated or too frequent requests? Have you checked the Terms Of Service?
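    Since the body of the 403 response carries the explanation, a script can check for it instead of only printing the status. What follows is a minimal sketch, not part of the original thread: `looks_blocked` is a hypothetical helper, the matched phrases are taken from the output quoted in the question, and the sample hash is hand-built to mimic an HTTP::Tiny response so the example runs without touching the network.

```perl
use strict;
use warnings;

# Hypothetical helper: guess whether an HTTP::Tiny response hash is a
# bot-protection block page rather than an ordinary error.
# The phrases matched are the ones seen in the question's output;
# other sites will word their block pages differently.
sub looks_blocked {
    my ($response) = @_;
    return 0 unless $response->{status} == 403;
    my $body = $response->{content} // '';
    return $body =~ /You have been blocked|Please enable JS/i ? 1 : 0;
}

# Hand-built sample mimicking the response shown in the question.
my $sample = {
    success => '',
    status  => 403,
    reason  => 'Forbidden',
    content => '<html><head><title>You have been blocked</title></head></html>',
};

if (looks_blocked($sample)) {
    print "Blocked by the site's bot protection; retrying unchanged is unlikely to help\n";
}
```

    In a real script the hash would come from `HTTP::Tiny->new->get($link)`; the check only tells you why the request failed, it does not get around the block.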

      I think you're right. I tried changing the URL, and it worked.
Re: HTTP::Tiny status 403
by Anonymous Monk on Sep 24, 2019 at 00:31 UTC
    Try a different user agent:
    my $response = HTTP::Tiny->new(agent=>'Mozilla/5.0')->get($link);
    The default user agents of popular HTTP libraries are often blocked due to abuse...
      Got the same error with that user agent :(
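
    For completeness, the user agent suggestion can be extended with a few browser-like request headers through HTTP::Tiny's `agent`, `default_headers`, and `timeout` constructor options. This is only a sketch under assumptions: `make_client`, the agent string, and the header values are illustrative and do not come from the thread. Because the block page asks for JavaScript, it may well return the same 403, as the plain `Mozilla/5.0` agent did, and the site's Terms of Service still apply.

```perl
use strict;
use warnings;
use HTTP::Tiny;

# Hypothetical factory for a client that sends browser-like headers.
# The agent string and header values are illustrative assumptions.
sub make_client {
    return HTTP::Tiny->new(
        agent           => 'Mozilla/5.0 (X11; Linux x86_64) ExampleFetcher/0.1',
        default_headers => {
            'Accept'          => 'text/html,application/xhtml+xml',
            'Accept-Language' => 'pt-BR,pt;q=0.9,en;q=0.8',
        },
        timeout => 10,    # seconds
    );
}

# Only touch the network when a URL is given on the command line.
my $url = shift @ARGV;
if (defined $url) {
    my $response = make_client()->get($url);
    print "$response->{status}\t$response->{reason}\n";
    print "Still refused; the site probably requires JavaScript\n"
        if $response->{status} == 403;
}
```

    Run it as `perl fetch.pl URL` (the script name is arbitrary). Fetching an https URL needs IO::Socket::SSL installed, which the original script already loads.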

Node Type: perlquestion [id://11106601]
Approved by stevieb
Front-paged by Corion