PerlMonks |
You seem to be confused about ethics.
Behaving ethically is not defined by anyone else's ability to prove that you did or did not behave ethically. If you do something unethical, it is unethical whether or not you're caught. If you find yourself having to make excuses for your behaviour, the odds are very good that you're behaving unethically.

If you write a program to scrape another website, even if it is just for personal use, courtesy and ethics say that you should pay attention to robots.txt. Sure, you can do by hand anything that you automate. But when you automate, you're likely to do a lot more of it than when you do it by hand. And you're likely to do it a lot faster. This has implications for the website that you're visiting, and it makes sense that website operators would ask you to be particularly polite to them.

If you say, "Oh, this is just for personal use" and turn a poorly written spider loose on a site, you're being rude and unethical. The website operator may well choose to repay your rudeness in kind and block you. They don't even have to go to court to do it either: they just notice that you're a bandwidth hog and lock you out.

But you asked several hypothetical questions. Here are some not-so-hypothetical answers. Someone might recognize that you didn't just use your browser because of the speed with which you hit the site, because of your user agent, or because they get access to your computer and find the program that you used to do it. There are other things that might strike them as suspicious, but the above is a good starting list.

In reply to Re^2: [OT] Ethical and Legal Screen Scraping
by tilly
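
For anyone wondering what "pay attention to robots.txt" might look like in practice, here is a minimal sketch in Python using the standard library's `urllib.robotparser`. It is only an illustration under stated assumptions: the robots.txt content, the `example.com` URLs, the user agent string, and the helper names `build_parser` and `polite_delay` are all made up for the example, and the one-second fallback delay is an arbitrary choice, not a standard.

```python
import urllib.robotparser

# Hypothetical robots.txt content, for illustration only.
SAMPLE_ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
Crawl-delay: 5
"""

# Hypothetical user agent string; a real one should identify you honestly.
USER_AGENT = "example-personal-scraper"


def build_parser(robots_txt):
    """Parse robots.txt text into a RobotFileParser without touching the network."""
    parser = urllib.robotparser.RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


def polite_delay(parser, user_agent, default_seconds=1.0):
    """Return the site's requested Crawl-delay, or a fallback if none is given."""
    delay = parser.crawl_delay(user_agent)
    return float(delay) if delay is not None else default_seconds


parser = build_parser(SAMPLE_ROBOTS_TXT)

# Ask before fetching: is this path open to our user agent?
allowed_public = parser.can_fetch(USER_AGENT, "http://example.com/index.html")
allowed_private = parser.can_fetch(USER_AGENT, "http://example.com/private/data.html")

# How long to sleep between requests so we don't hammer the site.
delay = polite_delay(parser, USER_AGENT)
```

A real spider would load the site's actual robots.txt, skip any URL for which `can_fetch` is false, and sleep for `delay` seconds between requests. That addresses the speed and volume problem; it does not replace sending an honest user agent.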