Re: [OT] Ethical and Legal Screen Scraping
by spiritway (Vicar) on Jul 27, 2005 at 02:38 UTC
This seems to be a really hot topic...
First, the article by Mr. Salzenberg alludes to a Federal law requiring spiders to obey the robots.txt file, but unfortunately he fails to cite that law. It isn't even clear whether he is referring to statute or case law, and that distinction is crucial.
I think it's clear that if someone has troubled to create a robots.txt file, he or she intends that spiders follow it. True - unless directories are password protected, people or spiders can access them. However, that doesn't make it legal or ethical to violate the request of the robots.txt file.
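For what it's worth, honoring the request is not much work. Here is a minimal sketch of checking a URL against a robots.txt file before fetching it, written in Python with the standard library's urllib.robotparser module. The robots.txt contents, the "MyScraper" agent name, the example.com URLs, and the may_fetch helper are all made up for illustration; a real spider would download the site's actual robots.txt rather than use a hard-coded string.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt, invented for this example: everyone is asked to
# stay out of /private/, and Googlebot is asked to stay out entirely.
SAMPLE_ROBOTS_TXT = """\
User-agent: *
Disallow: /private/

User-agent: Googlebot
Disallow: /
"""


def may_fetch(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt text permits user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes the file's lines, so no network access is needed here.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    checks = [
        ("MyScraper", "http://example.com/index.html"),
        ("MyScraper", "http://example.com/private/data.html"),
        ("Googlebot", "http://example.com/index.html"),
    ]
    for agent, url in checks:
        verdict = "allowed" if may_fetch(SAMPLE_ROBOTS_TXT, agent, url) else "asked not to"
        print(f"{agent} -> {url}: {verdict}")
```

A check like this only tells you what the file asks. It says nothing about whether ignoring the request is illegal, and it can't tell you why the Webmaster wrote the file that way.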
To my way of thinking, "ethics" is more or less applying the Golden Rule, or not doing to people what you would find objectionable. It also includes not doing what *they* find objectionable, within reason. In this case, even if I don't mind spiders sucking up my Website, being ethical would require that I not do this to others, if they ask me not to.
The question of what a Webmaster intended is simple to resolve - ask him or her. People do things for various reasons. Some might be trying to protect their Websites from Googlebot, but have no objection to you scraping them. Others might object to anyone using bandwidth unless there is a human being doing it - perhaps to view the ads on the site, or for other imponderable reasons. So, to me, the best way to resolve the matter is to ask the Webmaster.
As for the law, I doubt very much that anything most of us do will ever come to the attention of the authorities, unless someone sucks up a whole commercial Website and presents it as their own. Even then, the most likely result would be a sternly-worded "cease and desist" letter from Boyd, Dewey, Cheetham, & Howe, LLC.
I think that ethics is (are?) a personal issue where opinions are likely to vary widely. Even if everyone tries to abide by the Golden Rule, people are so widely divergent in their tastes that there is likely to be much disagreement. This is one of the reasons why laws are made - to enforce what is usually an unsatisfactory compromise.