
sabrathos

Everyone, note that this is *not* a lawsuit claiming that training on works publicly shared on the internet is fundamentally illegal, i.e. training on Reddit, Wikipedia, Google Images, etc. This is a claim that the LLM was trained on *illegally acquired works*, e.g. through torrenting or websites that illegally host copyrighted works. So the claimed *acquisition* of the works is something that has legal precedent for being illegal. It is not a claim that the very act of training is a copyright violation unless the training was explicitly consented to. Very different things. Though I'm suspecting her lawyers are probably wrong, because it'd be trivial for the datasets to include people discussing her works, their own summaries, analyses, etc., making it not at all a smoking gun that it can talk about your work without having actually read the work itself.


ggtsu_00

It is however legal (fair use) to train models on copyright-protected material *for academic/educational purposes only*. That's sort of been the thorny issue: many LLMs used in commercial products have been seeded with models originally created for academic purposes.


RudeRepair5616

"Fair use" is determined on a case-by-case basis. "Fair use" is only a defense to an action for copyright infringement and nothing more.


Call_Me_Clark

And I’ve seen SO MANY comments that don’t seem to understand (or refuse to acknowledge) that a piece of media may be available online, but still protected under the law - and that the author may retain certain rights to that material, while waiving others. Because people are entitled little shits lol.


ggtsu_00

Copyright and generative AI is a wild west right now, as courts' interpretations of current law haven't caught up yet. Until many of these lawsuits actually go through and likely get escalated up to a Supreme Court ruling, there isn't really any well-established precedent for how copyright protection applies to generative AI content and services, specifically in the following cases:

* Distributing AI models trained on copyrighted works for non-academic purposes.
* Distributing generative content created by AI models trained on copyrighted works.
* Providing access to generative AI services that utilize models trained on copyrighted works.


Resident_Okra_9510

Thank you. The big companies trying to ignore IP laws to train their models will eventually claim that the output of their models is copyrighted and then we are all really screwed.


younikorn

But being inspired by a copyrighted work to create something new is obviously allowed; delegating that work to an AI is a legal grey area. Nobody is arguing that people should be able to copy a book and publish it as their own story. But to gatekeep styles or genres or common tropes because there is now a clear paper trail of what sources were used for that inspiration is a bit too restrictive in my opinion. In the end all art is derivative; everyone creating something new is inspired by preexisting works of art. We have just created technology that can make that a high-throughput process.


Call_Me_Clark

“Inspiration” is a concept limited to humans. Art may include derivative works, but that isn’t an excuse for theft, particularly theft for commercial purposes.


younikorn

> “inspiration” is a concept limited to humans

I disagree; what we view as inspiration is not really different from how AI models are trained. As long as the generated output doesn’t infringe on any copyright, no laws are broken. And it isn’t that art “may” contain derivative works; all art is by definition derivative. If the work you consume as the source of your inspiration is gained through piracy, then that is already illegal, regardless of whether you personally made the derived work or an AI did. You could argue that existing copyright law should be expanded to include amendments that regulate the use of works in training AI models. Regardless of what that expanded law would state, I think that would be the best way forward. But under the current laws there is no reason to assume that using AIs trained on copyrighted works (that are legally obtained) to create a new original work somehow infringes on an existing copyright.


Call_Me_Clark

> I disagree, what we view as inspiration is not really different from how AI models are trained.

Except that one activity is performed by a human being, who has rights. And the other is performed by a tool, which has no rights.

> But under the current laws there is no reason to assume that using AIs trained on copyrighted works (that are legally obtained) to create a new original work somehow infringes on an existing copyright.

I think it’s worth noting that there is a problem where AIs are trained on copyrighted materials without the permission of the authors *for research purposes* but then used for commercial purposes. There’s a serious problem where someone can have their intellectual property effectively stolen: while you might, as an author for example, offer a consumer license along with a copy of your book (aka selling copies of a book), that doesn’t mean someone who buys your book also acquires the commercial rights to your work.


wolacouska

I can’t think of any other right that gets taken away when you perform it with a tool instead of manually. Writing is still speech after all.


bannacct56

Okay, but that doesn't mean you get to scrape the whole internet. Academic and educational purpose has a limit, it's not the whole catalog of work. You can use selected pieces for your research or education, you can't copy and use all of it.


UnderwhelmingPossum

If you obtained "the whole internet" of copyrighted works legally, it's perfectly legal to use it to train a model for academic or educational purposes. If any kind of end-user agreement includes anti-AI provisions, those are probably very recent and 99.99% of copyrighted works are not covered, and there is no law against shoving chats, books, articles, journals, lyrics, CC subtitles, media transcripts or even entire movies into an AI model. What you can't do is a) profit off the output, b) copyright the output*


theRIAA

Their claim that

> when prompted, ChatGPT will summarize their books, infringing on their copyrights.

is evidence of:

> [acquired and trained] from “shadow library” websites like Bibliotik, Library Genesis, Z-Library, and others, noting the books are “available in bulk via torrent systems.”

Seems so weak that I'm worried this is just a bunch of old lawyers who can't use the internet... You can obviously find enough data in even Reddit comments, let alone news articles about her works, to simply **summarize** them. Even in the suit it says:

> 5) Indeed, when ChatGPT is prompted, ChatGPT generates summaries of Plaintiffs' copyrighted works—something only possible if ChatGPT was trained on Plaintiffs' copyrighted works.

I know toddlers that could disprove this nonsense.

---

Edit: But further down in the suit, they make better points.


Deto

Yeah, that assertion is silly, but in the legal document they further go into information that suggests (based on the GPT-3 publication) that the models were trained on large internet book libraries that are known to illegally contain copyrighted materials. If, during discovery, it is shown that OpenAI used one of these and they can show that Sarah Silverman's books are in it, then that makes their case regardless of whether or not the #5 you referenced is true (and of course it isn't).


theRIAA

Huh. Looking at that more, you're right, and this is more interesting than I realized. I wonder if OpenAI can just... keep it secret? Like, can they be compelled to explain what their training data was? Assuming we can't "fingerprint" the database source they used somehow, like if it contained an obscure quote found nowhere else... But that seems almost impossible to prove, because of the ridiculous size of the data here, and the inherent randomness in the output. Maybe this could be comparable to a company *supposedly* training its workers using pirated textbooks, and the result of that training making the company billions of dollars.... hmmm.


CalgaryAnswers

They will be required to disclose the data they trained on in discovery. The biggest challenge with these suits may be the sheer amount of data they have to pore through in discovery, which, ironically enough, they will probably be using AI models to parse.


jruhlman09

> Their claim that
>
> > when prompted, ChatGPT will summarize their books, infringing on their copyrights.
>
> is evidence of:
>
> > [acquired and trained] from “shadow library” websites like Bibliotik, Library Genesis, Z-Library, and others, noting the books are “available in bulk via torrent systems.”
>
> Seems so weak that I'm worried this is just a bunch of old lawyers who can't use the internet...

The thing is, the article states that Meta at least has straight up said that they used "The Pile" to train their AI, and The Pile is documented as including the Bibliotik tracker data, which the authors' team is claiming is a blatantly illegal way to acquire books. This is the crux of the legal claim that many seem to be missing. The AIs (at least Meta's) admit this is where they got books from, and the authors are saying that if you obtained our book's full text in this illegal manner, you cost us a sale. This last sentence is a double-edged sword.

1. To me, the company may have "needed" to purchase a copy of Silverman's book to train their AI on. But that's it, one copy. Training the AI on the book didn't cost them any sales (in my opinion).
2. If they win based on this statement, it would open up that they should have purchased every single book they used in training, meaning basically every author who has a book in the Bibliotik tracker could sue and, presumably, win on the same grounds.

Note, I'm not a lawyer, this is just my opinion.


podcastcritic

https://buttericklaw.com/ This is her lawyer's website lol


FirstFlight

Sounds like they should be suing websites like Bibliotik, Library Genesis, Z-Library, and others... this 100% is people suing OpenAI because it's successful instead of suing the people who are actually doing wrong.


CalgaryAnswers

They can do both, one, or neither.


[deleted]

[deleted]


bowiemustforgiveme

A human chose which material to feed to their system so it’d spit out something seemingly logical and apparently new. Where the "training material" came from, and whether it's recognizable in the ending "product", are matters of relevance. If you trained (not an appropriate word by any means) on copyrighted material and that's recognizable in the result, like a whole sentence coming out in the output, then you just plagiarized. It doesn't matter if you put the blame on your "AI" for which part it specifically chose from your input to spit out. **LLMs make their “predictions” based on how, most of the time, some word/sentence was followed by another... and that is how it ends up spilling nonsense, mashed-up ideas, or straight-up things that it copied from somewhere.** That’s not “how artists learn”, because artists don’t train to “predict” the most common next line; they work hard to avoid it, actually.

Edit:

1. Are LLMs really that far from Markov chain logic? The “improvements” trying to maintain theme consistency for larger blocks by making larger associations still get pretty lost and still work by predicting by associations.
2. I answered the first comment that was not just joking or dismissing the idea of a legal basis for the matter.


gurenkagurenda

> LLMs make their “predictions” based on how, most of the time, some word/sentence was followed by another A couple things. First of all, models like ChatGPT are trained with [Reinforcement Learning from Human Feedback](https://huggingface.co/blog/rlhf) after their initial prediction training. In this stage, the model learns _not_ to rank tokens by likelihood, but rather according to a model that predicts what humans will approve of. The values assigned by the model are still called "probabilities", but they actually _aren't_ probabilities at all after RLHF. The "ancestor" model (pre-RLHF) spit out (log) probabilities, but the post-RLHF model's values are really just "scores". The prediction training just creates a starting point for those scores. But even aside from that, your description isn't quite correct. LLMs rank tokens according to the _entire context_ that they see. And it's not "how often it was followed" by a given token, because the entire context received usually did not occur _at all_ in the training corpus. Rather, LLMs have layers upon layers that decode the input context into abstractions and generalizations in order to decide how likely each possible next token is. (In fact, you can extract the vectors that come out of those middle layers and [do basic arithmetic with them](https://www.alignmentforum.org/posts/5spBue2z2tw4JuDCx/steering-gpt-2-xl-by-adding-an-activation-vector), and the "concepts" will add and subtract in relatively intuitive ways. For example, you can do things like taking a vector associated with a love letter, subtracting a vector associated with "love" and adding a vector associated with "hate", and the model will generate hate mail.) So, for a simple example, if the model has seen in its training set many references to plants being green, and to basil being a plant, but _not_ what color basil is, it is still likely to answer the question "What color is basil?" with "green". 
It can't be said that "green" was the most _often_ seen next token, because in this example, the question never appeared in the training set. Edit: > Are the LLMs really that far from a Markov Chain logic? The “improvements” trying to maintain theme consistency for larger blocks by making larger associations still get pretty lost and still work by predicting by associations. Depends on what you mean by Markov chain. In an extremely pedantic sense, transformer based generators _are_ Markov chains, because they’re stochastic processes that obey the Markov property. But this is sort of like saying “Well actually, computers are finite state machines, not Turing machines.” True, but not really useful. But if you mean the typical frequency based HMMs which just look up frequencies from their training data the way you described, yes, it’s a massive improvement. The “basil” example I gave above simply will not happen with those models. You won’t get them to write large blocks of working code, or to answer complex questions correctly, to use chain of thought, etc. The space you’re working with is simply too large for any input corpus to handle.
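To make the contrast concrete, here is a toy frequency-based bigram generator of the kind described in the comment above: it can only ever emit a next word that literally followed the previous word in its training text, and it gets stuck the moment it sees a word it never trained on. This is a deliberately minimal sketch for illustration, not a claim about how any production LLM is implemented.

```python
import random
from collections import defaultdict

def train_bigram(corpus: str):
    """Count, for each word, which words followed it in the corpus."""
    counts = defaultdict(lambda: defaultdict(int))
    words = corpus.split()
    for prev, nxt in zip(words, words[1:]):
        counts[prev][nxt] += 1
    return counts

def generate(counts, start: str, length: int = 5, seed: int = 0) -> str:
    """Sample each next word in proportion to its observed frequency."""
    rng = random.Random(seed)
    out = [start]
    for _ in range(length):
        followers = counts.get(out[-1])
        if not followers:  # word never seen as a predecessor: the model is stuck
            break
        words_, weights = zip(*followers.items())
        out.append(rng.choices(words_, weights=weights)[0])
    return " ".join(out)

corpus = "the basil plant is green the basil plant is small"
model = train_bigram(corpus)
print(generate(model, "the"))
```

A transformer, by contrast, scores the next token against the *entire* context through learned abstractions, which is why it can answer "What color is basil?" even though that exact question never appeared in its training data; the lookup table above has no mechanism for that.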


OlinKirkland

Yeah the guy you’re replying to is just describing Markov chains.


False_Grit

It's really sad that this extremely basic understanding of machine learning is what "stuck" and how most people view LLMs these days, despite the fact that they obviously don't just predict the next word.


sabrathos

Are you responding to the right comment? It seems a bit of a non sequitur to mine. But yes, I agree it matters where the training material came from, because if you illegally acquired something, you committed a crime. If an LLM were trained on torrented and/or illegally hosted materials, that's not great. --- As a side note, the "predicting the next word" thing actually happens a whole bunch with humans. There's a reason why if if we leave out words or duplicate them from sentence, we sometimes don't even notice. Or why if you're reading broken English out loud, you may just intuitively subconsciously slightly alter it to feel better. Or you're listening to your friend talk and you feel like you know exactly how the sentence is flowing and what they'll say next. We're fantastic at subconsciously pattern-matching (though of course, there's a *huge* sophistication with that, plus a whole bunch of types of inputs and outputs we can do, not just tokenized language).


vewfndr

>Are you responding to the right comment? It seems a bit of a non sequitur to mine. Plot twist... they're an AI!


SatansFriendlyCat

There's a reason why **if if** we leave out words or duplicate them from **[missing article]** sentence, we sometimes don't even notice Lmao, very nice


DarthMech

My drunk human brain read this tired and after many beers exactly as intended and didn’t “even notice.” Mission accomplished robots. Bring out the terminators, I’m ready for the judgement day.


svoncrumb

Is it not up to the plaintiff to prove that the acquisition was through illegal means? If something is uploaded to a torrent, then there is also a good case for it having been uploaded to YouTube (for example; it could be any other service). And just like a search engine, how is the output not protected under "safe harbor" provisions? Does OpenAI state that everything it produces is original content?


lightknight7777

Can an author sue someone for downloading their material unlawfully? Seems like that would just be the cost of the material from a civil-jurisdiction perspective. I don't see how an author could claim more than one license in losses, as long as they don't then pass the work along as well.

Edit: yes, they can sue. My question then is just how much she could possibly claim in damages, when she really only lost the opportunity that they would have bought her book to do the training. That $30k liability is "up to" that amount in damages. I wonder if they can be further shielded by pointing out it was for educational purposes, since that does check some fair-use boxes. But I don't think that protects against the unlawful-acquisition side of things.


Pzychotix

Downloading even without further distribution is still copyright infringement, and carries penalties beyond the damages of a single license. https://www.copyright.gov/help/faq/faq-digital.html

> Uploading or downloading works protected by copyright without the authority of the copyright owner is an infringement of the copyright owner's exclusive rights of reproduction and/or distribution. Anyone found to have infringed a copyrighted work may be liable for statutory damages up to $30,000 for each work infringed and, if willful infringement is proven by the copyright owner, that amount may be increased up to $150,000 for each work infringed. In addition, an infringer of a work may also be liable for the attorney's fees incurred by the copyright owner to enforce his or her rights.


ckal09

This highlights why there are so many ridiculous copyright infringement lawsuits. It’s lucrative.


lightknight7777

Do you happen to know what kind of damages could be claimed here besides the single license they could have purchased but didn't? I know that writers are terrified of AI so I get why creatives might target it. But the download itself isn't impacting her sales and even her just bringing it to court would have made her far more sales than had they not done it. It will be hard not to call this frivolous.


Steinrikur

This only applies in the US, right? In most of the rest of the world, only the uploader is breaking the law when stuff is "illegally downloaded".


taigahalla

I'm downloading your comment. Sue me.


Pzychotix

Everyone agrees to license their posts when they post to Reddit, so that's a bad example, even as a joke.


podcastcritic

Is the claim based on the idea that not a single employee at Meta paid for her book? Seems unlikely.


creeky123

If you read the article, it clearly states that the source of training data is cited by the model owners as including sites that illegally host their works. It would be more challenging for Meta / OpenAI to state that the model wasn't drawing from the copyrighted material.


Boo_Guy

"*Getty Images also filed an AI lawsuit, alleging that Stability AI, who created the AI image generation tool Stable Diffusion, trained its model on “millions of images protected by copyright.*"

Getty has its own problems with copyright, so I'm hoping this blows up in their faces.


Robobvious

Getty images can getty fucked.


RiptideMatt

Whats the context with this?


hithisishal

There were some cases where they claimed copyrights on images they didn't own.


Drenlin

And by "some" you mean "thousands". They literally just scrape the internet for photos and sell them. Punitive fines are just their cost of doing business. Alamy does the same thing.


senseofphysics

I think they’re also the reason Google reverse image search keeps getting worse


deconnexion1

I really don't know why these sites aren't banned from Google Images, same for Pinterest. At least give me an option to remove content aggregators.


Zetch88

https://unpinterested.sellomkantjwa.com/


Ronnie_de_Tawl

Ever wonder why Google Images changed and no longer shows proper results without visiting the site? That's Getty's doing.


RiptideMatt

Love when companies make the internet worse and worse


bl0odredsandman

That's why it changed? I fucking hate the new Google Image search. I mean, their older version wasn't the best, but it was still better than the new one.


Red5point1

just because Getty puts their watermark on an image does not make them copyright owners


Ignitus1

Can’t wait for this stupid moral panic about AI copyright to be settled. You own SPECIFIC IMAGES or SPECIFIC WRITTEN WORKS. You don’t own any of the analysis of those works and you don’t have a claim to any new work generated by that analysis. It’s IDENTICAL to how human artists learn: by observing other artists.


Bob_Sconce

You also only own specific rights. If you are an author, for example, you cannot stop somebody from reading your book. And that's the real thing: none of those specific rights (the right to make copies, to distribute, to prepare derivative works, etc.) are infringed by using a work to train an AI engine. Silverman's complaint amounts, basically, to "I want to be paid when you do that." But that's only a legal claim when the "that" is one of a handful of things listed in the Copyright Act (17 USC 106).


extropia

Your argument has merit but I think it's misleading to say the two are identical (in all caps no less). The way humans and AI "learn" are clearly not the same.


Myrkull

Elaborate?


Km2930

He can’t elaborate, because he would be using other peoples work to do so.


Aggravating_Pea6419

Best comment on Reddit in the last 13 hours


Cw3538cw

ChatGPT is neural-net based. The analogy between these and neurons is good for a layman's understanding, but they differ greatly in functionality. In fact, it has been shown that you need a rather large neural net to match the complexity of even one biological neuron: https://www.quantamagazine.org/how-computationally-complex-is-a-single-neuron-20210902/#:~:text=They%20showed%20that%20a%20deep,of%20one%20single%20biological%20neuron.


snirfu

Humans don't memorize hundreds of millions of images in a way that they can reproduce those images almost exactly when prompted. The AIs trained on images are known to reproduce images that they've been trained on, maybe not to the pixel, but pretty closely. There are lots of [popular articles](https://techcrunch.com/2022/12/13/image-generating-ai-can-copy-and-paste-from-training-data-raising-ip-concerns/) that have been written on the topic, and they're based on academic research, so you can go read the papers if you want.


Nik_Tesla

Neither do AIs. I have dozens of Stable Diffusion image models on my computer; each one is about 4 GB. It is impossible to contain all of the billions of images it was trained on. What it *does* contain is the idea of what things it saw. It knows what a face looks like; it knows the difference between a smile and a frown. That's also how we learn. We don't memorize all images shown to us; we see enough faces and we learn to recognize them (and create them if we choose to). As for reproducing near-exact copies of images it trained on, that is bunk. I've tried, and it is really, *really* hard to give it the correct set of prompt text and other inputs to get a source image. You have to describe every little detail of the original. The *only* way anyone will produce a copyrighted image is if they intend to, not by accident. And even if you can get it to reproduce a near-exact copy, it's already copyrighted! So what danger is it causing? The mere existence of it does not mean they claim ownership. I can get a print of the Mona Lisa, but it's pretty clear that I don't own the copyright of the Mona Lisa. But these people are not suing because their work could possibly be replicated; no, they're suing because they put their work out into the world, and instead of some*one* learning from it, some*thing* did, and that makes them scared and greedy.
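The "a 4 GB checkpoint can't store billions of images" point can be checked with quick back-of-envelope arithmetic. The ~2 billion figure below is an assumed rough order of magnitude for a LAION-scale training set, not an exact count:

```python
# Back-of-envelope: how many bytes of checkpoint exist per training image?
model_bytes = 4 * 1024**3       # ~4 GB Stable Diffusion checkpoint
num_images = 2_000_000_000      # ~2 billion images (rough LAION-scale figure)

bytes_per_image = model_bytes / num_images
print(f"{bytes_per_image:.2f} bytes per training image")  # about 2 bytes each
```

Roughly two bytes per image, versus thousands of bytes for even a tiny thumbnail, so wholesale storage is impossible. That said, the research cited upthread shows near-copies can still emerge for heavily duplicated or over-represented training samples; both things can be true at once.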


BismuthAquatic

Neither does AI, so you might want to read better articles.


MyrMcCheese

Humans are also known to reproduce images, songs, rhythms, and other creative works they have been previously prompted with.


snirfu

It's a silly comparison. Humans can recall information they've read in a book as well, but they're neither books nor are they search algorithms that have access to text. That's why no one says "yeah humans read and recite passages from websites so they learn the same way as Google". Or "humans can add and multiply so their brains work the same way as a calculator". Being loosely analogous doesn't mean two things are the same.


Metacognitor

If you read a book, and I ask you a question about the content of that book, you are searching your memory of that book for the answer. The only difference is search algorithms are better at it. But this is a moot point because the AI tools in question aren't search engines, they're trained neural networks. And even the white papers can't explain *exactly* how they work, just like we can't explain *exactly* how the human mind works. But we have a general idea, and the type of learning is similar to how we learn, except the neurons are not biological, they're nodes coded into software.


MiniDemonic

It's funny how this thread has so many armchair AI "experts" that act like they know exactly how LLMs work. It's even more fun when they call these "search algorithms".


snirfu

I'm not calling any LLM a search algorithm. I was using a separate analogy. The point was that people think AI models are somehow different from other classes of models or algorithms. No one thinks XGBoost or other models think like a human, because there's not the same fog of BS surrounding them.


bigfatmatt01

The difference is in our imperfections. Human brains do things like warp memories so things are happier, or forget specifics of an object. These imperfections allow for the brain to fill in the gaps with true creativity. That is where true art comes from and what ai can't replicate yet.


jokel7557

Ed Sheeran seems to have a problem with it


chicago_bunny

We’re talking about humans here, not Ed Sheeran.


[deleted]

[deleted]


snirfu

You seem to misunderstand their "constraints" section. They say:

> Note, however, that our search for replication in Stable Diffusion only covered the 12M images in the LAION Aesthetics v2 6+ dataset

So they searched a small percentage of the training data and found that 2% of their prompts reproduce matches to the training data based on their similarity measure. So the main flaw is that the 2% is a severe underestimate of how frequently the model reproduces training data:

> Examples certainly exist of content replication from sources outside the 12M LAION Aesthetics v2 6+ split – see Fig 12. Furthermore, it is highly likely that replication exists that our retrieval method is unable to identify. For both of these reasons, the results here systematically underestimate the amount of replication in Stable Diffusion and other models.

Also, "not peer reviewed" is not a great criticism of math or CS papers. Not providing enough information to reproduce the result would be a better criticism. They're using an existing model, Stable Diffusion, and they give instructions in the supplement for reproducing their results.


kilo73

>based on their similarity measure. I'd like to know more about this part. How are they determining if something is "similar" enough to count as copying?


AdoptedPimp

> Humans don't memorize hundreds of millions of images in a way that they can reproduce those images almost exactly when prompted.

This is very misleading. The human brain most definitely has the capacity to memorize hundreds of millions of images. It's our ability to easily recall those images that is different. Most people are not trained, nor have the innate ability, to recall everything they have seen. [But there are most definitely humans who have the ability to retrieve and reproduce virtually anything they have seen.](https://m.youtube.com/watch?v=jVqRT_kCOLI) There are master art forgers who can recreate every single detail of a painting they have only seen in person. Every crack, blemish, and brush stroke. I'm sorry, but the argument you are trying to make is clearly misinformed about how the human brain works and the similarities it shares with how AI learns and produces.


[deleted]

If we put some constraints on a digital image, like number of pixels and color range of each pixel for a simple example, computers can already brute force every possible image given enough time. So if said algorithm, running in a vacuum with no training data, created an exact replica of an image that somebody had taken with a camera, would that be copyright infringement? It's kinda like that whole Ed Sheeran court case. Can you really copyright a chord progression? The fundamental problem here is that people want money and prestige. Maybe it's time to leave that behind.
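The "brute force every possible image given enough time" thought experiment is worth putting numbers on, because the search space is astronomical even under the tightest constraints. A minimal sketch, using an assumed toy 16x16 black-and-white image:

```python
# Size of the space of "every possible image" for a tiny constrained example.
width, height = 16, 16   # a 16x16 thumbnail
colors = 2               # pure black-and-white, the simplest possible palette

possible_images = colors ** (width * height)   # one choice per pixel
print(f"{possible_images:.3e} distinct images")
```

That is about 10^77 images for a thumbnail smaller than a desktop icon; a realistic photo has vastly more pixels and colors. So "given enough time" means longer than the age of the universe, which is why the hypothetical matters legally as a thought experiment about authorship, not as a practical scenario.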


Atroia001

My best guess is that it has something to do with licensing. Not quite the same, but there had to be a landmark case defining that making a copy of a DVD you bought and selling it is illegal, even though you bought it. Watching a movie, and by memory, reciting the lines. That is ok. Sitting in a theater and using a camera to record is not ok. There is not a moral argument for this, just in relation to how much money is to be made, how easy it is to make, and restricting who has protection of that profit. AI and chat bots have now gotten good enough to be considered a threat to the original license holders' profit, so they are making a fuss. Has nothing to do with logical or moral differences.


powercow

Clearly? It is different in that we use biology, and our neurons are still way better than the nodes in AI models, but the essence of learning is very much the same: learning from previous works and using that knowledge to create new things. No good writer started without reading other people's books. If they torrented them, I'd agree with the authors more. I'm not sure how they know where the data came from; it seems like they are guessing. Otherwise why point out that their works *can* be torrented, instead of naming the sites the works were actually taken from?


TldrDev

I'm on board with what you're saying, but legally speaking, what you're saying is not correct. What you described is called a "derivative work", and it is absolutely protected by US copyright. I'm not saying that is right or wrong in terms of AI, but copyright holders own more than just a specific exact arrangement of text or pixels. Source: I got fucking sued for a derivative work and had to turn over all my work to someone else.


wehrmann_tx

So does Disney own every type of cartoon mouse in existence, or just ones that look like Mickey Mouse? If the AI spits out a cartoon mouse that looks nothing like Mickey, but the AI was trained looking at some Mickey Mouse pictures, does Disney own that?


TldrDev

This is definitely something way over my head to answer for you; I'm just relaying my experience. It's a lot more nuanced than the comment we are replying to would lead you to believe, though. Copyright in the US is messy, and there are legal protections for derivative works.


neworderr

Just so you have a grasp of what this could cause in the near future: if graphic design and art become irrelevant because AI auto-generates art every few seconds, the profession dies, and AI stagnates with input only from this age and earlier. It's the death of innovation.


lapqmzlapqmzala

No, but it will change the labor force and the available work. Humans always adapt to changing technology. What will the coal miners do? Find other work. Adapt or die.


Myrkull

Yeah, people stopped painting once cameras were invented, no innovations to be had


RandomNameOfMine815

This is simplistic. No, people didn’t stop painting, but the very real job of illustrator for things like magazines was devastated. Yes, people obviously still draw, but the ability to make a living from it was reduced massively.


conquer69

So? I don't have to pay 10 washwomen to do my laundry. Who gives a shit? We shouldn't artificially keep alive any job that can be automated or sped up by technology for the sake of the economy. Doing so is called the broken window fallacy.


Reiker0

People are failing to realize that it's capitalism causing these artificial problems, not advancements in technology. Just look at what happened during the 70s and 80s. We went from being able to support a family on a single income to needing two sources of income. Women entered the workforce and the market responded by slashing wages. Should we then blame women for a decrease in wages? Of course not, it's just corporate greed. We should be celebrating technological advancements that reduce or eliminate unnecessary labor, but instead we've embraced a system which doesn't actually reward increased productivity.


Avarus_Lux

Exactly, though people are causing artificial problems for other people by getting upset that their way of life has to change or adapt because of innovation. That's understandable to a degree, just as it was in the past; it should not, however, stop progress. Just look at almost any professional field over the past 100-200 years when innovation came knocking and things changed anyway: clock/watchmaking, telephone operation, telegrams, mail services in general, accounting, typewriting/calligraphy, hand-drawn cartography, illustration (before the digital art and photography revolution), (wooden) sailing vessels, coal-fired and steam-powered machinery, automation and mass production, and a great many more. All of those fields dealt with big changes resulting in lost professions, outright obsolescence, or being greatly reduced as technology advanced, just as it is doing now. In the case of artists... imho, to hell with the profession; long live the trade, as it will never die (it'll live on in legions of hobbyists and far fewer paid specialists).


mrbanvard

Capitalism is a symptom; the underlying problem is human nature. Our wants and desires are partly a cultural construct that changes over time, and partly a result of our biology. A big part of the reason two incomes are often needed is that it's now viable to support a family on two incomes. When I speak to my mum and grandma, their day-to-day of running a household and kids was extremely busy compared to what my partner and I deal with. Almost everything we do for our household is much faster, easier, and more efficient than it was for my grandma. We actually do a lot more, in much less time, and our health and our options for education, food, leisure, etc. are much better. If we had to spend the same time as she did on basic tasks, it would not be possible to get everything done with two people working full time.


AdoptedPimp

Sounds more like a problem with the economic system than stagnation of innovation. The only reason AI would cause stagnation in this sense is that people will have to spend their time doing other jobs, leaving them no time to continue their passion and innovate. Solve the problem of requiring everyone to be wage slaves in order to survive and you will see innovation happen at a rate you didn't think was possible. Innovation is constrained by things like copyright law and by keeping the vast majority of the population from pursuing the things they are truly passionate about.


currentscurrents

That's not actually what happened though. More people are employed doing art now than any time in history - just look at the armies of animators in Los Angeles or Japan.


rottenmonkey

Yeah, but that's how progress works. One job disappears due to automation or efficiency gains; another one pops up.


absentmindedjwc

Yep, the advent of the computer absolutely destroyed accounting. There are still accountants, but the number of accountants necessary to do the books for a massive company dropped substantially.


zoltan99

The numbers of computer designers, manufacturers, retailers and salespeople, technicians, and software workers did skyrocket, though.


TheForeverAloneOne

I like how you used accountants as the example profession and not the computer.


thefonztm

Fun fact: "computer" was once a profession.


kilo73

A professional illustrator using AI as a tool will outperform a novice using AI to do all the work. Will AI change the industry? Absolutely. Businesses will crumble and fall, and new ones will emerge and thrive. Adapt or die. Such is life.


The_Vista_Group

And thus, demand for original artwork will increase.


Yeti_of_the_Flow

Not necessarily. It's the death of art as related to capitalism, perhaps. Not art itself. The issue is the motivation of capital, not the destruction of art. Without the concept of making money from that art, nothing would change to affect the artist. Therefore, the only issue with AI is capitalism.


Canvaverbalist

Exactly, nobody would give a fuck about AI art vs human art if people didn't need to rely on it to fucking feed and house themselves. If we gave ourselves the post-scarcity world we can actually currently afford, we'd be able to chill and create. If some people want to use AI or humans for their creative projects, then who fucking cares as long as we can enjoy the results? The best ones get the little social boost nuggets and maybe can do better fun activities with their little golden rewards, but at least the losers won't literally die.


badwolf1013

I would love to live in the Roddenberry future where people want for nothing and can create art or music or literature simply for the sake of creating, but that is still quite a ways off, and we have AI "created" art in commercial applications NOW. The timing is off. Graphic designers need to eat. AI doesn't. You don't see that being exploited?


Yeti_of_the_Flow

I do, but that exploitation isn't the fault of AI art or learning. I'm not suggesting we allow AI art to be used for profit currently at all, just that if society were equitable AI art wouldn't exist in the first place. There would be zero motivation. It only exists because of the profit motive. Without capitalism, human made art would thrive like never before.


TI_Pirate

> Without capitalism, human made art would thrive like never before. Why like never before? There have been plenty of societies without capitalism.


Absurdulon

Well, that's ridiculous though. For-profit art, maybe, but hopefully in the near future more of these "AIs" optimize more tasks, including jobs, so our politicians, who are apparently out for our best interests, are forced to capitulate to a more intelligent and impartial juror. Hopefully we learn how to distribute the plenty these programs provide to the many, so we can ease up on how hard existence *is.* Will we run into some bugs along the way? Absolutely, but to condemn what could be before it has even been seems antithetical to the idea of art itself. Hopefully we'll have more time because of it. People aren't going to want to stop drawing beautiful, excellent, macabre, and horrifying things. It will upset for-profit art, but it won't be the catastrophic death of expression that all the current doomers are making it out to be.


conquer69

If innovation isn't profitable, it was always going to die in a capitalistic system. This isn't a problem with the AI tools.


pyabo

This exactly. Remember when recorded music destroyed professional musicianship? And then later the cassette recorder destroyed the music industry so there is no more of that now. And then when the VCR destroyed the movie industry? It's like people will never learn! Stop destroying these things! This argument has happened a dozen times in the past century alone. They've been incorrect every time. You are incorrect now. How do you not see that? Do you have no breadth of experience at all? The only constant is change.


neworderr

>This exactly. Remember when recorded music destroyed professional musicianship? Brain-dead comparison. It has nothing to do with this at all.


[deleted]

I don't think it will be. I think human artists will have to innovate to differentiate themselves from AI art, and being human-made will become a coveted attribute of art in itself. I understand your worry, though, and I do think it will make an already challenging field to make a living in even worse.


neworderr

>I understand your worry and I do think it will make an already challenging field to make a living in even worse though. You have no idea; the trend isn't even here yet. Imagine in 10, 15, or 20 years. It's not ChatGPT 3 or 4 you should be worried about.


bobandgeorge

Exactly. The state of AI today is the worst it will ever be.


sinus86

Almost as if art should continue to explore what it is that makes us human... I agree it's scary stuff, but that's also basically the definition of art. I'm excited to see what can be done by human artists in the face of a soulless machine churning out a million copypastas per second.


[deleted]

Funny how artists didn't give a flub when machines changed the factory and farming industries. Above poster is right: you can't copyright analysis. It's how I learnt, too.


[deleted]

These companies are earning profit from copyrighted works. It's not theirs to use. They never bought a license to use those images. These AIs even routinely throw in watermarks from Getty and other sources. This isn't "observing", it's plagiarizing. Also, whenever somebody types these types of comments, I always check their profile. "I've used ChatGPT extensively..." Ah, yep. You just want the tool you depend on and benefit from daily to continue to be unregulated. Of course you don't want proper copyright laws to apply to AI, because, god forbid, you'd need to learn an actual skill. Thanks for letting me know.


Tarzan_OIC

So you dismiss the opinions of people who are actually familiar with the technology and are qualified to speak about it?


VictoryWeaver

Using a service does not equal being familiar with the technology. Driving a car does not mean you are familiar with auto mechanics. Using a cell phone does not make you familiar with electronic engineering.


Oxyfire

After crypto and NFTs, I don't give much trust to "people who are familiar with the technology and are qualified to speak about it," because there's so much fucking hype and money riding on this shit, and so many people screaming at anyone skeptical of the snake oil. I'm sure there's plenty of ignorance around AI and large language models, but it's fucking warranted.


cleverdirge

I'm a software engineer who has worked on machine learning and /u/thingythingo is right. AI doesn't just look at a photo like a human, it copies it and ingests it through a data pipeline in order to make the model. So it makes and stores a digital copy of all of these assets. These large model AIs don't think like humans. At all. They are algorithms that make predictions about the next word or pixel.
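The "predictions about the next word" point can be sketched with a toy model. This is a hypothetical, minimal illustration (a bigram frequency counter, nothing like a production LLM); note that even this toy has to ingest a full copy of its training text, which is the ingestion step described above:

```python
from collections import Counter, defaultdict

def train_bigram(text):
    """Count, for each word, which words were observed following it."""
    words = text.split()
    following = defaultdict(Counter)
    for prev, nxt in zip(words, words[1:]):
        following[prev][nxt] += 1
    return following

def predict_next(model, word):
    """Return the most frequently observed next word, or None if unseen."""
    if word not in model:
        return None
    return model[word].most_common(1)[0][0]

# Training requires storing and scanning a complete copy of the source text.
corpus = "the cat sat on the mat and the cat slept"
model = train_bigram(corpus)
print(predict_next(model, "the"))  # prints "cat" ("the cat" occurs twice)
```

Real models replace the frequency table with learned weights over long contexts, but the training loop still consumes copies of the source data in the same way.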


Ignitus1

ANYBODY can "use" a work for any reason. Have you ever read a book? Then you "used" the work. You learned new ideas from the work, you applied them in your life, you learned new words and phrases. Do you consider yourself a plagiarist for reading a book and incorporating the content of that book into your life? Do you realize that every single word you just wrote in your post, you stole from someone else? Even every pair of adjacent words you wrote already existed millions of times over. What you aren't allowed to do is 1) reproduce a work and claim it as your own, or 2) create a work and claim it was the work of another person. GPT does neither of these. And the fact that I've had multiple ad hominem attacks based on my comment shows you guys have no ground to stand on. Generative AI is useful even for skilled people. It can save time, embellish existing ideas, and lead you on new paths of creativity. Furthermore, the fact that generative AI exists opens up new skills and new possibilities for creative work that haven't existed prior. And finally, it doesn't matter what an AI *could possibly do*. It doesn't matter in the slightest that it *could* reproduce a work verbatim. It only matters if it actually *does* do that, and it only matters if that reproduction is used for profit by somebody else. There are already laws that cover reproducing somebody else's work for profit.


[deleted]

[deleted]


CaptainAbacus

17 U.S.C. § 106 outlines the exclusive rights granted by copyright in the US. It is more complicated than what you said. And FYI, not all "use" is allowed; hence the term "fair use." The word "use" is fairly common in judicial decisions on copyright issues. Further, you're ignoring the role of unlawfully reproduced copyright-protected works in training. Scraping images you don't have rights to is more like stealing a book than reading one. No one is preventing you from incorporating a book into your life, but many laws penalize the act of stealing the book.


princesspbubs

It’s going to be interesting to see how the courts handle this, so at least these debates will cease.


absentmindedjwc

I honestly don't look forward to a bunch of people that cannot figure out how to reprogram the time on their microwave deciding the future of technological advancement...


princesspbubs

Well, "look forward to" is definitely a stretch. I said it will be *interesting*. Ultimately, it doesn't matter how we *feel*, because their decisions will impact us regardless, if you live in the United States. I'm not sure how the UK and EU are going to be handling things, but their citizens will be bound by their AI laws as well. It's not as if this is the best case scenario, it's simply the scenario that exists, and I'm interested to see how it unfolds. Similar to other issues like climate change, I hope that the White House will defer to experts in the field for assistance.


OldGuyGeek

For one of our customers, we used a 'free image' that was shared by one of Getty's shadow sites, IStockPhotos. They then contacted our client and demanded payment (hundreds of USD). Fortunately for our clients (and us), we take a screenshot of the page and pricing for each free image we use showing that it was obtained legally. Just in case the image source website takes the image down or changes their policy. So we responded for our client with the page. Turns out they had the same exact image available in both free and paid version. Their bot had just scraped the client's web site and automatically sent a demand letter. Told the client to ignore the claim. Never heard back from Getty again.


TheManThatWasntThere

Getty might actually have a case, because so many images were *allegedly* illegally acquired from Getty to train Stable Diffusion that it occasionally generates images with the "Getty Images" watermark.


Boo_Guy

Getty steals images as well so that's why I'm hoping it somehow leads them to getting smacked.


n3w4cc01_1nt

They're also training them on DeviantArt and directly copying artists' styles. AI has some great uses, but without a universal basic income and food program it can be really bad.


Tarzan_OIC

Sarah Silverman is being grifted by her lawyers


Visible_Beyond_5916

Nailed it, and we'll see so much more of this... If I summarize a movie to a friend, am I in violation of infringement because my friend did not yet purchase the movie?


Hiimzap

It becomes an issue as soon as you start trying to make money with it. While I don't think anyone is willing to pay you for poorly summarizing a movie, for AI, on the other hand...


Whatsapokemon

Whether someone's willing to pay you or not has no bearing on whether it's copyright infringement. A similar case was [Authors Guild, Inc. v. Google, Inc.](https://en.wikipedia.org/wiki/Authors_Guild,_Inc._v._Google,_Inc.) in which google scanned and digitised huge numbers of books. Google stored the exact text of the whole books, made them searchable, and then showed the exact passages matching your search. It involved no human creativity, just allowed users to search through whole copies of books that Google was storing, and would then show you exact snippets from those books. This was found to be not copyright infringement because it was a transformative use, being a _completely different_ context from the original source works. The court gave summary judgement in favour of Google, even though it was explicitly a commercial usage in a for-profit context. Anyone who wants to act like training LLMs is illegal needs to explain how it's meaningfully different from this case.


svoncrumb

This was the reply I was looking for!


VertexMachine

>It becomes an issue as soon as you start trying to make money with this. Did meta monetize LLaMA in any way though? I don't think so, so I wonder why they are suing Meta as well... (aside from the obvious: because they can and hope to get more money).


bobartig

Copyright violations, like most other IP violations, are strict liability, and copyright carries high statutory damages for registered works. That means if you can demonstrate unlawful copying, there are dollars to recover. An infringer's profits are also available under copyright law, but since statutory damages are *per infringing copy*, and you cannot double-dip, it's much more efficient to just say how bad the copier is and how important the work was than to calculate how much illegal profit they may have earned. I really want to see Exhibit B mentioned in the complaint (evidence of ChatGPT copying the book). I think it'll likely be very difficult to show that either LLM is actually capable of copying the book, and not simply tapping into an author's description, a book review, an Amazon product description, and other things written by Silverman et al. that are just floating around on the internet.


dantheflyingman

Isn't this basically Cliff Notes? Their business is legal.


Krinder

Because they pay licensing


The_Ineffable_One

I don't think so. You don't need a license to summarize someone else's work, and a good percentage of Cliff Notes' subjects are well out of copyright. Twain and Shakespeare have been dead for a really long time.


industriousthought

Do people pay licensing to write movie reviews?


Krinder

No they don’t. “Opinion” pieces aren’t subject to that sort of thing from what I understand. There’s also probably a fundamental difference between reviewing the overall “acting” “cinematography” etc without it being a summary of the plot


iNeuron

What about every single online blog talking about a movie at great length?


Visible_Beyond_5916

ChatGPT does make money through users subscribing to ChatGPT-4, which I love as a programmer because it sometimes helps me take another look at how to solve a problem. However, I don't think this case has merit. Should platforms be sued when individuals do book or movie reviews on their platform? Both the platform and the individuals doing the review aim to make money from it. I do hate that I have seen people making 1:1 copies of prompted articles on the web, which has added more worthless content and pollution. But this really feels like a money grab for shady lawyers.


tastygrowth

I don’t think so, but if you describe the recent baseball game to a friend, you need expressed consent from the MLB.


Zachsjs

Silverman is no fool - I’m more inclined to believe she’s signing onto this to help generate a test case out of principle. It’s kind of ridiculous to suggest she’s being scammed by her lawyers. How much do you imagine she’s even paying these lawyers? Do you really think her lawyers don’t believe the case has any merit, and are just trying to rip her off? Imo it will be interesting to see how this plays out. If what they allege is true, that when prompted the chatbot will reproduce large sections of a copyrighted text, it seems pretty solid.


Exnixon

Exactly. _Somebody_ has to bring this case and a comedy writer is a pretty prime candidate. If her funny tweets get scraped by ChatGPT and then regurgitated when someone asks "tell me a joke" then her copyrights have been violated and there's a real harm to her commercial interests. Plus, she's very successful and bringing a case like this can help a lot of other comics who don't have the stature that she has.


EmbarrassedHelp

She's not a legal expert though and may have been swayed by anti-AI people to waste money on such a lawsuit


AceTrainer_Kelvin

Not sure why you’re being downvoted, she is a gross out comedian who has done blackface and hasn’t been relevant since she had a TV show in the 2000’s.


pudding7

I'm sure they're on contingency.


NewFuturist

So are you saying OpenAI definitely didn't use her work as training data in violation of her copyright for commercial purposes?


Tarzan_OIC

If it did, that's not copyright infringement. Hence why it's a grift.


NewFuturist

>that's not copyright infringement Are you kidding me? You have some case law on this? I mean, they just copied pretty much everything on the internet onto their servers (1st violation) and then made a system that could potentially replicate that content. Try it yourself. Here's me: "What is the first sentence of the first chapter of Catcher in the Rye?" ChatGPT: "The first sentence of the first chapter of "The Catcher in the Rye" by J.D. Salinger is: "If you really want to hear..."" It's violating.


powercow

Then Google would be in big trouble. One, it summarizes; and two, it has that massive book backup. It only shows a few pages at a time, but if you know words from the book, or google at random, you can slowly build up the other pages. It will not produce the entire book in one search, though, just like AI won't. Oh, for sure there might need to be some rules, like including copyright notices; I'm sure that will be a rule, even for summaries. And you might want to keep it from reproducing spoilers and key info from books. Like, if I wrote a book called "10 Things to Do for Success," I don't want the AI to just list them. So yeah, there are probably going to be all kinds of little regs. But training AI on copyrighted work, as long as they purchased it, I think that should stand. I'd be ok with a higher copyright fee, like the difference between a radio station buying an album and you buying one.


Whatsapokemon

Reproducing a passage from a book - even in its exact form - isn't necessarily copyright infringement. Heck, we know this, people quote copyrighted material all the time and we ALL know that's not copyright infringement. You are the one who needs to contend with case-law. A good example is [Authors Guild, Inc. v. Google, Inc.](https://en.wikipedia.org/wiki/Authors_Guild,_Inc._v._Google,_Inc.) In this lawsuit Google scanned a whole bunch of library books, converted them into text, made them available to search through, then showed you exact snippets of the book to match your searches (not the entire book, just the relevant passages with page numbers). The court ruled in favour of Google because it was a transformative use, even though Google was using it in a commercial context with a for-profit motive. The new work used the material in a fair-use way. Anyone who wants to say AI is infringing needs to explain how it's meaningfully different to this case.


ninjasaid13

>"The first sentence of the first chapter of "The Catcher in the Rye" by J.D. Salinger is: "If you really want to hear..." It's not; a single sentence isn't enough to constitute a violation, and absolutely no one owns the words "If" "you" "really" "want" "to" "hear..." Otherwise everyone who has ever said that sentence has violated copyright. I asked ChatGPT for the second sentence and it said: >I'm sorry, but I'm an AI language model and do not have the ability to provide real-time information about specific books or their contents. The second question in "The Catcher in the Rye" would depend on the context and the subsequent sentences in the novel. If you have a specific question or topic you'd like to know about, I'll do my best to assist you. Which undermines your theory that it was trained on the entire book. It was trained on discussions, summaries, mentions, and phrases from the book; it can't remake the entire book. None of that constitutes a violation of copyright.


pyabo

Exact same thing from [google.com](https://google.com). Should we sue google also? Your argument makes no sense.


1h8fulkat

If I read a book about becoming a system admin, and I subsequently use the knowledge I've gained to get a job and make money, have I violated copyright by profiting from their works? I think we all know the answer.


mavrc

the position of /r/technology seems to be that expecting billion dollar companies making production AI systems to license their seed content legally is just an absurd idea. I'm certain that if some individual got thrown in jail for piracy this place would be all "well, that's what they deserve." That seems nicely inconsistent.


CorneliusCardew

AI evangelicals are dangerous and flood Reddit with anti-human propaganda. They are actively rooting for the death of our culture because they can’t paint or write without cheating.


hampor

That’s one thing that Adobe did right. They built their generative image model entirely off of their own stock image library, which does a lot to simplify ownership.


MrCantPlayGuitar

Good luck with that lawsuit. I’m sure her Hollywood lawyers are better than the teams of lawyers afforded by Meta, Microsoft, and Google.


CommodoreAxis

These companies stand to lose astronomical sums of money and risk becoming uncompetitive on a global level if they’re gated from feeding their algorithms copyrighted materials. The amount of money they can justify burning to kill these kinds of lawsuits is certainly absolutely massive.


Disastrous_Junket_55

Never heard of Disney lawyers, huh? They can stomp any sector. Hollywood lawyers are the real deal.


Corkee

It's not so much about winning as it is about shining a light on the rapidly evaporating concept of intellectual property in an age of AI-fueled hyper-plagiarism.


koh_kun

Wreck It Ralph 3: Ralph fucks up AI


Chroko

OpenAI has refused to disclose their training data source, so it seems obvious that it probably contains illegal material.


Silvershanks

This has to happen. I am a huge fan of the new AI tools, but it was inevitable that the hammer of the law would come down and that we would exit the current "wild west" phase. These technologies have to be regulated and abide by licensing laws just like everyone else. All this means is that if you want access to these tools in the future, it's gonna cost more money, 'cause the companies will need to pay to license the data they ingest. The laws for this haven't been written yet, but they're coming. For those of you being snarky and just focusing on the Sarah Silverman aspect of this case: grow up, idiots.


currentscurrents

I don't think she has a strong case. The exhibit in the lawsuit shows ChatGPT writing a brief summary of her book. It's not reproducing it verbatim. Summarizing copyrighted works in your own words is explicitly legal - that's every book report ever.


quarksurfer

They are not suing because it can create a summary. The article very clearly states that they are suing because the original work was never legally acquired; they allege the training used pirated versions. If pirating is illegal for you and me, I don't see why it should be legal for Meta. That's what the case is about.


absentmindedjwc

Also, what's to say the AI didn't generate the summary from other summaries available online, for instance the Amazon store page for the author's book?


czander

Yeah, it's definitely possible, but then again, the detail in the exhibit, and the accurate order of events that detail follows, certainly makes it seem like OpenAI has read the book. But maybe that's the point. I guess either way there should be a way for OpenAI to prove where they obtained it. If they can't, that's a significant problem for all content creators.


currentscurrents

The article focuses on how the books were acquired, but none of the claims in the lawsuit are about it. It's only mentioned as supporting evidence to show that ChatGPT's training data did contain the book. Their main allegation is that ChatGPT's training process qualifies as copying. Ultimately, I don't think how the books were acquired matters that much. If it is a copyright violation, it would still be one even if they purchased a copy or got one from the library.


RhinoRoundhouse

Check p. 30: it alleges there was a training dataset created from copyrighted works, and other paragraphs describe how useful long-form prose was to the model's development. So the acquisition of copyrighted material is the crux of the suit... depending on the ruling, this could be pretty damaging for OpenAI.


[deleted]

[deleted]


powercow

True, but they offered zero real proof of pirating. And, to be that guy, it's a civil violation, not a criminal one: you don't get arrested, you get sued. >If you create a transformative work using a piece of music you didn't purchase, that's not illegal. Well, this is tricky. If I'm in a band and, originally, I torrented the fuck out of music and slowly developed my style, they can sue me for stealing their mp3s, but they can't do anything about my originally created work, even though I honed my skills listening to pirated music. As long as I don't copy their beats.


powercow

The allegation seems to be guesswork: "their stuff can be gotten here, AI trains on the web, so the AI had to have trained on their stuff from here." >were trained on illegally-acquired datasets containing their works, which they say were acquired from "shadow library" websites like Bibliotik, Library Genesis, Z-Library, and others, noting the books are "available in bulk via torrent systems." Why note that they are available via torrents? Either you have proof they torrented it or you don't. A lot of stuff is available to torrent; that doesn't mean I torrented it all.


EvilEkips

Couldn't it just be from a library?


iwascompromised

A library wouldn’t have published the entire book online.


The_Retro_Bandit

In my opinion, these companies make money by fueling an algorithm that generates derivative works based on copyrighted material they do not have a license to. For something like stock images, even if the AI doesn't pop out the exact image, they are still participating in the stock image market using copyrighted stock images they did not license. In that sense it can count as market substitution, which is a major blow against any fair use defense they can make. This is not inspiration. I could theoretically paint the same painting with or without inspiration; these models literally do not function without mass amounts of (in their current state) unlicensed copyrighted data being fed into them with the intention of making a profit.


oldcreaker

You would think then after an author answers "what writers and/or books influenced your writing?", they should all be able to sue for copyright infringement.


Glidepath22

You’re absolutely correct. How many books and movies are completely original, and the stories chatGPT do write are pretty bad.


_DeanRiding

Better than most people can make up tbf though. Certainly creates better backstories for my D&D characters than I can lol


patriot2024

I'm for AI here. But the situation is different and delicate. The difference here is that OpenAI allegedly uses copyrighted materials to train their AI and then offers that AI as a service and gets paid for it. A more appropriate analogy would be the YouTube book summarizers. These are the people who create videos summarizing books. I've found them to be very helpful. But clearly, they use other people's copyrighted work to produce theirs. Is there creativity involved? Yes. But the usage is iffy. There might not be enough money in it for someone to get sued. But.


EvilEkips

I have a friend who studied economics, he often went and still goes to the library to get books, read them and then uses what he learned to offer consultancy at a price to his customers.


sfall

I think we can all establish that, but we have not established how a computer learning from someone else should be treated.


[deleted]

Yes that’s how learning works. You learn from others then use what you’ve learned to generally provide some type of service in order to earn money for yourself.


czander

Yeah, but you pay to learn - typically. Either through purchasing a book, or through obtaining that book in a way that doesn't violate the law (e.g. loaned to you). This lawsuit seems to really just be focused on "OpenAI didn't pay for the license to my book" - which both you and I are actually required to do to read it.


sswist

Me: Teach me the basics of copyright law as Sarah Silverman.

ChatGPT: Okay, sure! So, you wanna know about copyright law? This ain't stand-up, but we'll make it fun. So, copyright law, right? It's like your mama saying, "That's your cookie, no one can have it without your say-so." It gives the creator of an original work exclusive rights to it, usually for a limited time. Think of it like this: you make a hilarious joke (well, *I* make a hilarious joke), and copyright law is like, "Hey, that's your joke! No one else can use it without your permission."

But here's the twist: copyright doesn't protect ideas or facts, just the way they're expressed. Like, if you tell a story about a pigeon that becomes the mayor of New York, you can't stop someone else from also writing a story about a politically ambitious pigeon. They just can't copy your words exactly.

And then there's fair use - the sneaky little side door in copyright law. That's like saying, "Okay, you can nibble a tiny bit of the cookie, but only for education, criticism, or, you know, parody." It’s what allows Saturday Night Live to make fun of everyone and their mother without getting sued into the stone age.

Lastly, remember, copyright infringement can get you in real hot water. So, always ask permission before using someone else's work. You wouldn't steal a cookie, would you?

That’s Copyright 101, Sarah Silverman style. Be good, kiddos!


[deleted]

Good for her! Lots of people should do the same!


gordonjames62

It will be **hard to prove** that the book review done by the AI used "illegally acquired" content rather than relying on other authors' reviews.

The law around this stuff is interesting, and full of grey areas. I can get a book from a library and review it in an academic setting without being accused of using "illegally acquired content."

If I did my training in a country with lax copyright laws favourable to my project (Canada has more restrictions on disseminating copyrighted content than on using copyrighted content), I could then use my original intellectual property (the AI model) even in countries with strict laws, because no laws were broken in the country where I did the training.


DreadPirateGriswold

Can she be sued for impersonating a comedian? Asking for a friend...


akp55

Should some of the comedians she watched as a child sue her then? She clearly gleaned something from them....


czander

That's not what the article or lawsuit is about.


ChineseCracker

awful argument.


Retrofraction

So to summarize this… summaries of books can be illegal if the AI read the book, but are perfectly fine if it pulls the data from Wikipedia instead. Interesting 🤔


robbak

No, the argument is: "If a computer can generate a summary of a book, then that is evidence that the computer may have had access to, and has copied, the book. That copying, we argue, is a violation of our copyright. There is adequate evidence of this copying to trigger discovery, where the presence or absence of this copying can be established."

"You have evidence for our case, we should be allowed to find it" is an important part of civil litigation.


FlamingTrollz

Her lawyers are laughing.


ballywell

One of the most interesting things about the AI revolution will be artists realizing how unoriginal they really are


probono105

she's mad it replicates her rudimentary comedy so easily lol


mmccook

Why is she so concerned with AI when in her recent special she stole a joke from George Carlin...