in reply to LWP and Cloudflare

Try setting the User-Agent string to something a real browser sends, e.g. Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/61.0.3163.100 Safari/537.36.

That's usually enough to get around such "blockades".
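
For reference, a minimal sketch of that suggestion using LWP::UserAgent. The `agent` and `timeout` constructor options are standard LWP; the variable names and the command-line URL handling are just illustration, and nothing here is guaranteed to satisfy any particular site.

```perl
use strict;
use warnings;
use LWP::UserAgent;

# Browser-like User-Agent string, copied from the suggestion above.
my $browser_agent = 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) '
                  . 'AppleWebKit/537.36 (KHTML, like Gecko) '
                  . 'Chrome/61.0.3163.100 Safari/537.36';

# Replace LWP's default "libwww-perl/x.y" agent with the browser string.
my $ua = LWP::UserAgent->new(
    agent   => $browser_agent,
    timeout => 10,
);

# Only touch the network when a URL is passed on the command line
# (the URL is whatever you supply; none is assumed here).
if (my $url = shift @ARGV) {
    my $res = $ua->get($url);
    print $res->status_line, "\n";
}
```

Run it as `perl fetch.pl <url>` (the script name is made up) and check the printed status line to see how the server reacts to the changed header.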

Edit: Tried that, and I'm getting a 503 now. I also noticed the site is protected by Cloudflare. Now you're entering dangerous waters.

Since Terms of Use apply only to users who agreed to them (read: who are logged in), it's OK to scrape a site even if they don't want you to. Bypassing Cloudflare and similar measures, however, is outright hacking and illegal in most places.


You can lead your users to water, but alas, you cannot drown them.