For me, there's just one "don't":
- Don't piss off the owners of the site.
From that rule, many others can be deduced:
- Obey robots.txt.
- Don't flood a site.
- Don't republish, especially not anything that might
- Be very conservative when visiting sites that fund
themselves by showing ads. Any time you fetch something
without fetching the ad(s), it costs them money and
gains them nothing.
- For anything you need to register for, don't do anything
that conflicts with their terms of service.
Remember that your robot will be a guest in other people's
territories. Act accordingly.
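
To make the first two derived rules concrete, here is a rough sketch of a gate that checks robots.txt and spaces out requests per host. It uses Python's standard urllib.robotparser. The class name PoliteCrawlerGate, its methods, the "examplebot" user agent and the 1-second fallback delay are all made up for illustration, and refusing any host whose robots.txt hasn't been loaded yet is just one cautious choice, not a standard.

```python
import time
import urllib.robotparser
from urllib.parse import urlparse


class PoliteCrawlerGate:
    """Decides whether and when a URL may be fetched (hypothetical helper)."""

    def __init__(self, user_agent, default_delay=1.0, clock=time.monotonic):
        self.user_agent = user_agent
        self.default_delay = default_delay  # assumed fallback, in seconds
        self.clock = clock                  # injectable so it can be tested
        self._robots = {}                   # host -> RobotFileParser
        self._last_fetch = {}               # host -> time of last request

    def load_robots(self, host, robots_txt):
        """Register the robots.txt body already downloaded for a host."""
        parser = urllib.robotparser.RobotFileParser()
        parser.parse(robots_txt.splitlines())
        self._robots[host] = parser

    def allowed(self, url):
        """True only if the host's robots.txt is known and permits the URL."""
        parser = self._robots.get(urlparse(url).netloc)
        return parser is not None and parser.can_fetch(self.user_agent, url)

    def wait_time(self, url):
        """Seconds to wait before the next request to this URL's host."""
        host = urlparse(url).netloc
        parser = self._robots.get(host)
        # Prefer the site's own Crawl-delay, if it declares one.
        delay = parser.crawl_delay(self.user_agent) if parser else None
        if delay is None:
            delay = self.default_delay
        last = self._last_fetch.get(host)
        if last is None:
            return 0.0
        return max(0.0, delay - (self.clock() - last))

    def record_fetch(self, url):
        """Note that a request to this URL's host was just made."""
        self._last_fetch[urlparse(url).netloc] = self.clock()
```

In a crawl loop you would skip the URL if allowed() is false, otherwise sleep for wait_time(), do the request, and call record_fetch(). None of this covers the republishing, ad, or terms-of-service rules; those are judgment calls, not something code can check.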