PerlMonks  

Downloading of content for personal use

by wombat (Curate)
on Sep 01, 2000 at 23:04 UTC (#30782=monkdiscuss)

Here's a nice little murmuring fart to start a ball rolling. Are Perl scripts which go out on the web, grab dynamic content, and bring it back to the mother machine for processing "wrong"? I believe the crux of the issue here is what you do with it.

Currently I have a number of bots that dutifully get up at 6:24:00 each morning and get the content for my "Personal Newspaper". Then, when I get up in the morning, I pop into my internal webpage and there it is. Among other things, I see the web comics that I like the most, as well as the current PerlMonks $norm rating, along with the calculated thresholds of $norm*2, $norm*3, etc. In the Fun Stuff section of the Code Catacombs, there's a script which uses Lynx to get the daily content of User Friendly and forward it on elsewhere. Yohimbe, who says he is responsible for putting together the UF systems, said that he was wondering what all those lynx requests for graphics were about. I would be interested in hearing more about how much of it goes on, whether there are a lot of evil people vs. a lot of benign souls like us who use it for personal use, and whether anything has been heard from other web comic companies about their takes on it.

The chief reason that I feel so passionately about this is that a few months ago, my ISP shut me down because a publishing company had complained that I was reposting copyrighted information without permission. They had discovered my newspaper even though I had not told ANYONE about it and had not provided any links to the rest of the web for bots to follow. No questions asked, my ISP blew my connection out of the water, and THEN called me to inform me they had done so. It was only after I signed a note saying that I'd take it down that they let me back on. Nowadays, my newspaper is protected with lots of passwords, and is scrambled unless someone comes and enters a command from within my system, whereupon it only unscrambles for 10 minutes. Needless to say, I don't want this happening again, but my question still stands. Am I so wrong to be doing this? Do we all do it? Do the publishers really care about folks like us? I put this to you to discuss.

~W

RE: Downloading of content for personal use
by chromatic (Archbishop) on Sep 02, 2000 at 00:04 UTC
    At the risk of simplifying some things and complicating others, I'll draw a parallel to television and VCRs. With regard to broadcast channels (or unscrambled satellite channels), the signals are available, with the right equipment, cost-free.

    You could say the same about much web content.

    In the case of broadcast television, the costs are obviously underwritten by advertising content -- and occasionally you'll see banner ads on some of the more forward-thinking web sites out there.

    Some VCRs and VCR-like devices have mechanisms to bypass commercials during playback or in pseudo-real time. Besides that, you can always change the channel, leave the room, check your e-mail, or just press MUTE on the remote control. There is an implicit understanding that a non-zero portion of watchers will not ignore the commercials; the default is to view them.

    Compare that to ad viewing on web sites. By default, browsers do not block ads and they do display images. Installing a filtering proxy or disabling image autoloading bypasses this, as does using a non-graphical browser or an automated process through LWP. Again, there's an implicit understanding that, by default, a non-zero portion of browsers will see the ads.

    Just as with television, there are no requirements that a human will actively see the advertisement.

    Having said all of that, I would suggest that this applies to personal use only. Certainly rebroadcasting a television program with the original commercials stripped out (or worse yet, replaced with your own commercials) would be rather immoral (and even less legal). The same idea seems to apply nicely to web content. Yes, it's easy to grab the latest UF comic and mirror it on my web site (and that's nicer than linking it right off of Yohimbe's server, but still unethical), but as the company derives some income from ad revenues, I would be depriving them of that income from other people -- by default.

    If web sites make certain aspects of their content available, say through an RSS file, that would be fine. Anything else? Personal use seems to be okay, but public redistribution is right out.

(jeffa) RE: Downloading of content for personal use
by jeffa (Chancellor) on Sep 02, 2000 at 01:27 UTC
    This reminds me of a similar situation that the last company I worked for was in. We grabbed item content from several well-established online vendors - some vendors would send us their item data, but some would refuse to do so. For the ones that refused, well, we got their goods anyways - via a web bot (and Parse::RecDescent and some queue daemons).

    Personally, I see nothing wrong with this, simply because that information was made public by the vendor themselves. You just have to navigate through their web site to get the info - why not have a bot do it for you?

    I think the main reason that Amazon (oops, I mean Vendor X) didn't want to just give us the data was because it would require them to hire/delegate someone to do the task.

    But we were all a little concerned that maybe we were stepping over the line of intellectual property. In the end, Vendor X (or any of the other non-participating vendors) never contacted us with a cease and desist warning.

    I have always been a subscriber to "if someone makes data they own publicly available, then it should be publicly malleable" - meaning that bots can grab data and do whatever the heck you want to with it - just as long as that data is not copyrighted. If you have to crack a password to get the data, it's not legal.

    Jeff

RE: Downloading of content for personal use
by turnstep (Parson) on Sep 02, 2000 at 03:37 UTC

    > ...a few months ago, my ISP shut me down because a
    > publishing company had complained ...

    > No questions asked, my ISP blew my connection out of
    > the water, and THEN called me to inform me they had
    > done so. It was only after I signed a note saying that
    > I'd take it down that they let me back on. ...

    This is a little off-topic, but you really should get yourself a new ISP. Suspend first, notify later is a really crappy policy. Some ISPs are more reasonable than yours seems to be. If you don't mind, why not tell us who it is, so we can avoid them like the plague. Unless, of course, they sue perlmonks for libel. :/

      It's not libel if it's true. Besides which, I don't think companies get the same protection from libel as ordinary folk... something about being in the public arena.
      > This is a little off-topic, but you really should get yourself a new ISP. Suspend first, notify later is a really crappy policy.
      Unfortunately, it is pretty standard in .uk - there is a precedent that even leaving *news items* on your news spooler after someone has sent in a legal notification about them (for libel in this case), items that weren't even posted from your ISP, is grounds for the complainant to sue the *ISP* for bignums (presumably on the basis that many offenders are either untraceable or judgement-proof).
      There was a reasonably recent case where an online newsletter website was shut down - and in order to get it back the content providers were told they needed a signed statement from their lawyer that no postings libelous to the complainant would ever be posted (note, there *had* been no such postings to date; the complainant sent in a nasty-gram on the basis that they *might* write something he didn't like in the future).
      Add this to the RIP bill and BT's heel-dragging over Flatrate, and you don't have to wonder why England isn't going to be the "ideal home for ecommerce" the government used as an election platform.
        Ah yes, Laurence Godfrey vs Demon. Here is the story. And a legal analysis.

        Anyone who wants can go back to an old (circa 1993) FAQ to get a sense of what he was like. (Search for his name.)

      Something like this happened to me a few years back.
      With a few firm words about the legal implications of their actions they restored my access with credit for the lost days. Needless to say I dropped icubed.com as a service provider at the end of that month. (and never looked back)
