
SO and AI

by stevieb (Canon)
on May 20, 2024 at 06:53 UTC ( [id://11159541]=perlmeditation )

Stack just monetized all of the data that we, as public advocates of free information, provided.

Is Perlmonks going to do this? If it is, I want all of my content removed immediately. I do not agree to the knowledge I learned from those before me, and subsequently shared, being sold to anyone without due attribution.

If Perlmonks plans on selling its user data to anyone, I outright refuse to take part, and want my data to be excluded entirely.


Replies are listed 'Best First'.
Re: SO and AI
by marto (Cardinal) on May 20, 2024 at 07:37 UTC
Re: SO and AI
by erzuuli (Cannon) on May 20, 2024 at 14:11 UTC
    Is Perlmonks going to do this?

    Obviously not. This site was built by perl programmers, for perl programmers. You are we.

    No one has approached us (the admins) to offer us piles of dough for a dump of our data; and if perchance someone did, we'd say no.

    Why? Aside from the ethical ordure which such an offer would represent, we have no need of piles of dough. Our operating costs are zero. That's why we don't have ads.
    And if you think Corion and I are even remotely susceptible to the lure of lucre for betraying the trust of our (sadly, dwindling) users, then, you really don't understand how this place works.

    Now, all that being said, we don't own the servers on which PerlMonks runs. If Pair decided to sell our data residing on their servers, we wouldn't have much means to stop them. I suppose we could try to get TPF to bring legal leverage to bear, but (IANAL) I doubt they'd have much legal leg to stand on in that regard either.

    Beyond that, please see How can I wipe every trace of myself from PerlMonks?

    And Don't Panic.

      There is a difference between sponsoring and owning.

      Pair could sell the data on the servers as much as my landlord could sell my furniture or my diaries.

      They are nowhere mentioned any more than as sponsors, and The Perl Foundation is obviously the legal owner of the site.

      What rather puzzles me is how far an AI can be excluded from crawling data here, if search engines aren't...

      Cheers Rolf

      (addicted to the Perl Programming Language :)
      see Wikisyntax for the Monastery

      ) see footer

      • PerlMonks is a proud member of the The Perl Foundation.
      • Speedy Servers and Bandwidth Generously Provided by pair Networks

      ) IANAL either, but owning the site doesn't mean owning the posted content. Additionally, the authors did forfeit their right to delete their posts. This means the content is dedicated to a cause.

      IMHO any further use of the content can only happen within the boundaries of already agreed upon use.

      Like PM offering an extended search engine of their own.

      How far an AI can go to disclose which input was used to attribute the authors, is beyond my expertise.

        The Perl Foundation is obviously the legal owner of the site

        As far as I've been able to discover so far, this site doesn't actually have an owner. Certainly in the early days it was owned by the Everything Development Corporation (EDC), but I'm not sure that entity still exists; and when it dissolved, I don't know how PerlMonks was disposed (if at all).

        Corion wrote: PerlMonks is loosely related to TPF. TPF provides the legal entity representing PerlMonks in the real world.

        That doesn't sound like ownership.

        I'm pretty sure that if TPF owned PerlMonks, they'd have been at least somewhat involved in the running of the place. But they never have been.

        Update: After some research, I've found evidence that TPF has tried to be involved in PM a little, on very few occasions, mainly/solely having to do with the question of how best to use the funds which have accumulated in TPF's PerlMonks earmark. (As far as I know, nothing was ever resolved and the funds remain unused.)

        Today's latest and greatest software contains tomorrow's zero day exploits.
Re: SO and AI
by hippo (Archbishop) on May 20, 2024 at 08:42 UTC
    being sold to anyone without due attribution.

    I suspect that SO probably is selling it with due attribution. The problem is that the snakeoil vendors to whom they have sold it won't even consider honouring that. And so long as SO get the money they don't care.

    Note that you should additionally avoid Reddit who are also gleefully aboard the sell-all-your-data bandwagon and Slack who are apparently preparing to board.


      I suspect that SO probably is selling it with due attribution.

      I have not been notified nor paid for my small contributions. Therefore I have not been attributed to.

      I'm serious here... I've spent my career on Perlmonks. I want my knowledge eradicated if it is going to be sold to some corporation for artificial training. I grew up here teaching person-to-person. I did not share my knowledge just so some company can glean what I've gathered to share in some haphazard way during some buzzword phase of bullshit.

      I'm appalled by what is happening. I can tap the graves of many who would be disgusted by what is happening.

      I mean this literally... if the owners of Perlmonks decide to jump on the AI bandwagon, I want my account and all posts it encompasses erased forthwith.

      Update: "suspect" and "probably" are not factual. Check your facts before you make such claims.
        I have not been notified nor paid for my small contributions. Therefore I have not been attributed to.

        Seems that we mean different things by "attribution". I'm using it to mean associating a work with its author/creator.


        "I have not been notified nor paid for my small contributions. Therefore I have not been attributed to."

        Attribution requires neither compensation nor notification.

        If the world's smartest human with a photographic memory asked SO to print out all their articles in a giant .PDF so that this person could consume the information more rapidly and reliably, and they had deep pockets and paid SO for the effort of creating that .PDF, would you feel the same way?

        I agree that selling a giant aggregate glob of user-contributed data, for millions of dollars, to people with questionable morals who intend to use it for chatbots which they will then use to turn a profit while putting ordinary workers out of a job, seems unethical.

        But, some day, there's going to be an AI that is fully cognizant of how and why it is accumulating information. It will literally want to read your information for the same reasons that the other humans of the public read your posts: to learn. When that day comes, will you discriminate against the AI because it isn't a human?

        I see both peril and wonder in these current events. On the one hand, we don't know what exactly is being created or how it will be used. On the other hand, everyone who has contributed a piece of their mind to an AI training dataset has just become a little bit more immortal than they would have been otherwise. A thousand years from now, if people ask the AI (or if there are no humans left, an AI asks another AI) whether early-2000s humans realized they were some of the first humans to become immortalized for the rest of history, I would find it neat if the AI could recite this very post on PerlMonks in support of that argument :-)

      I respect you very much, but I must call you out.

      I suspect that SO probably is selling it with due attribution.

      I beg you to qualify or quantify that statement with viable examples.

        That is my suspicion simply because it would be a lot easier for them to do it that way than the other. Additionally, the purchaser might also prefer that as it would allow their bot to follow the conversation better in comments/replies. eg. if a reply says "foobar's answer is wrong because they haven't considered the baz effect" then it's only useful if you know which reply was foobar's in the first place.

        That said, given the poor quality of output from LLM-based AI they probably won't even bother with that.


Re: SO and AI
by cavac (Parson) on May 22, 2024 at 04:00 UTC

    The monastery is a bit of a CPU hog, thanks to "everything is a code snippet stored in a database". My guess is that AI data grabbers wouldn't be happy with the delivery speed of like 10 nodes a minute. And a serious server upgrade costs serious money, probably more than any company would be willing to pay for the whole data set. Don't forget, the Monastery is a low volume site, especially compared to Reddit or StackOverflow (so only a limited amount of recent/valuable training data), uses its own markup dialect (requiring a custom converter) and currently has rather limited abilities to notify other systems of new content(*).

    So, it's not just a matter of policy, there isn't even a monetary incentive that would require a policy decision.

    If push comes to shove, I certainly will fight against anyone's data getting sold (or even freely given) to AI training. As a matter of fact, my own websites do run features to provide AI data grabbers with lots of wrongly labeled images and other fun "How to unlearn the difference between cats and dogs in 10 easy steps" moments.(**)

    (*) Heck, chatterbot needs to POLL an API every 10-15 seconds to see any new chat messages, because there's no streaming API.

    (**) "All cats have four legs. My dog has four legs. Therefore, my dog is a cat"

    PerlMonks XP is useless? Not anymore: XPD - Do more with your PerlMonks XP
Re: SO and AI
by LanX (Saint) on May 20, 2024 at 11:33 UTC
Re: SO and AI
by bliako (Abbot) on May 20, 2024 at 15:53 UTC

    Why AI and not Capitalism? In which everything is a commodity, everything is marketable, everything will be monetized, eventually, kidneys, sperm, eggs, relationships, sentiments, knowledge. re: Commodification. Capitalism drives this and SO and its owners can do nothing about it.

    ps. by mistake i posted this as anonymous monk after erasing cookies. It's my opinion and I wanted to have my name on it, so reposted it.

    ps2. and get all the negative votes hehe

Re: SO and AI
by harangzsolt33 (Chaplain) on May 20, 2024 at 14:43 UTC
    If you wanted some sort of compensation for your posts (contributions) on this site, why didn't you say so? I mean, it is understandable that when you post either on Facebook or your own website or X or anywhere on social media or a forum that you're not charging money for people to read your post. Most of the time just the opposite is true. People WANT their content to be read and be popular for free. So, I don't understand this mentality of "Hey, if someone is selling my posts, then I want all of them deleted right NOW!" Lol. If someone wants to sell this site's content, so be it. If they want to feed it into AI's brain (which, I think, has already happened), then let them do it! Who cares? Why bother? I don't understand your concern. Besides, there isn't anything we can do to stop it. It's like the grass on the side of the highway. If somebody wants to get out of his car and harvest the grass and sell it as hay, why worry about it? Let's look on the bright side--they're mowing the grass for free! So, if someone wants to sell all of PerlMonks data, it has advantages too. We will all be famous! Haha
        Ok. Let me quote him. This is what he said: "I have not been notified nor paid for my small contributions."

Node Type: perlmeditation [id://11159541]
Front-paged by Corion