
BootinUp

(47,141 posts)
Mon May 29, 2023, 07:49 PM

AI tools like ChatGPT are built on mass copyright infringement

ZAINAB CHOUDHRY
CONTRIBUTED TO THE GLOBE AND MAIL
PUBLISHED MAY 25, 2023
UPDATED MAY 26, 2023

Zainab Choudhry is a startup founder who has worked in law, technology and media in New York and Toronto.


The old world, wherein only human minds had evolved to create stories, art, music and poetry, is no more.

We have now entered the era of generative artificial intelligence, a type of AI that can create new content based on datasets of existing content it has been “fed” and trained on. In this new era, generative AI developers are building the minds of these machines by training them on content created by humans of the past and present, from Shakespeare to Atwood, Caravaggio to Koons. Thus far, we have marvelled at the creations that generative AI tools such as ChatGPT have produced, but this use of AI raises crucial ethical and legal questions.

It takes enormous amounts of data to train a generative AI program like ChatGPT, and in order to build these tools cheaply and quickly, developers are committing mass copyright infringement. These datasets are largely created by combing and scraping the internet for every type of content, from articles, books and artwork to our photos and tweets. These methods give rise to some big questions: Is the use of our copyright-protected content for training generative AI models legal? Does the use of copyrighted content for training AI fall under fair-use exceptions in the United States and fair dealing in Canada? Do we have a right to compensation when our work is being fed to the machines?

As a former copyright startup founder equipped with a law degree and a long-standing career at the intersection of intellectual property (IP) law, media and tech, I know the rules broadly boil down to one central tenet: To use someone else’s original content, you must get their permission, barring some exceptions. In my opinion, using copyrighted content to train a generative AI, without permission, easily falls under copyright infringement. If you train a generative AI model on the content of a particular painter or poet’s work, or even a singer’s voice, the AI can do a pretty good job of replicating the exact content and style of those paintings, poems or vocals in the new works it creates. At its lightning speed, generative AI can train on and write a new book based on an author’s work long before the human author ever could.

Continued

22 replies
AI tools like ChatGPT are built on mass copyright infringement (Original Post) BootinUp May 2023 OP
+1,000,000,000,000,000,000 highplainsdem May 2023 #1
ive said it before. neither artificial or intelligent. just well constructed algorithms. bullimiami May 2023 #2
Exactly. anciano May 2023 #4
An AI just found a powerful antibiotic. Maybe no original thought Pisces May 2023 #8
its a great tool. ability to sift through masses of data and identify patterns. bullimiami May 2023 #16
That's as nonsensical as expecting submarines to swim. AI are artificial and they go beyond Bernardo de La Paz May 2023 #11
Another downside: it makes things up! An attorney got in trouble when it invented cases: Liberty Belle May 2023 #3
AI will trample Copyright to pulp liberal N proud May 2023 #5
That is why AI is just drab regurgitation and data dump bucolic_frolic May 2023 #6
Oh sure. Regurgitation and data dump just found a super-bug antibiotic no human found. . . . nt Bernardo de La Paz May 2023 #10
I think an argument could be made that as humans we consume these Pisces May 2023 #7
No. We train people at universities using copyright protected content Bernardo de La Paz May 2023 #9
You are so ready to make the AI computer equivalent BootinUp May 2023 #12
No. Nothing in my posts is what you are trying to make me say. Bernardo de La Paz May 2023 #13
The only way I can interpret BootinUp May 2023 #14
If you want something or someone to learn about human cultural heritage Bernardo de La Paz May 2023 #15
So I recognized the logic behind your post BootinUp May 2023 #17
How long should copyrights last? Effete Snob May 2023 #18
That is not what I was discussing and I think it is irrelevant BootinUp May 2023 #19
Right Effete Snob May 2023 #20
You know something, I don't like the way you tried to pin some shit on me. BootinUp May 2023 #21
If you want to discuss this subject let me know. BootinUp May 2023 #22

bullimiami

(13,084 posts)
2. ive said it before. neither artificial or intelligent. just well constructed algorithms.
Mon May 29, 2023, 08:01 PM

there are no original ideas here.
just the ability to scan a vast repository of existing information and repackage it according to programming.

bullimiami

(13,084 posts)
16. its a great tool. ability to sift through masses of data and identify patterns.
Mon May 29, 2023, 09:37 PM

just an advancement in programming.
more processing power and availability of data has made it possible.

Bernardo de La Paz

(48,988 posts)
11. That's as nonsensical as expecting submarines to swim. AI are artificial and they go beyond
Mon May 29, 2023, 09:12 PM

AIs go beyond their algorithms the same way Einstein went beyond Newton: by making connections that other people did not see.

bucolic_frolic

(43,128 posts)
6. That is why AI is just drab regurgitation and data dump
Mon May 29, 2023, 08:21 PM

If you ask AI detailed questions, you soon find out it is a bit shallow. Don't expect real thinking or interpretation. I've stumped it several times. It just doesn't "know" what to think because it's a rehash of others' thinking.

Pisces

(5,599 posts)
7. I think an argument could be made that as humans we consume these
Mon May 29, 2023, 08:21 PM

Works and develop things from our own library of knowledge taken from others. These works are purchased at some point and then input or taken from the internet where some works are free. I don’t like AI or what can come of it in the future, but I’m not sure copyright infringement will apply.

Bernardo de La Paz

(48,988 posts)
9. No. We train people at universities using copyright protected content
Mon May 29, 2023, 09:09 PM

It is as if the writer wants artists to never see a Picasso or Van Gogh or Dali.

BootinUp

(47,141 posts)
12. You are so ready to make the AI computer equivalent
Mon May 29, 2023, 09:15 PM

to a human. Why? They are as much alike as an amoeba is to a rock.

Bernardo de La Paz

(48,988 posts)
13. No. Nothing in my posts is what you are trying to make me say.
Mon May 29, 2023, 09:17 PM

It is you who want a tractor to be like a donkey, by your logic.

Of course humans and AI are different, duh.

BootinUp

(47,141 posts)
14. The only way I can interpret
Mon May 29, 2023, 09:25 PM

"No. We train people at universities using copyright protected content

It is as if the writer wants artists to never see a Picasso or Van Gogh or Dali."

is that you think it should be ok to "train" an AI computer on copyrighted works because humans are trained that way.

Bernardo de La Paz

(48,988 posts)
15. If you want something or someone to learn about human cultural heritage
Mon May 29, 2023, 09:30 PM

... you have to train the someone or something with copyright protected content.

Technically, though Picasso, VG and Dali did many or all of their important works before 1927, the photographs of their work are copyright-protected content.

If you won't forbid an African American woman from studying Van Gogh, why would you prevent an AI?



Then flip it. How would you expect an AI to have any knowledge or understanding of Van Gogh's impact without studying his paintings?

How would you expect an AI to have ANY level of comprehension of "Imagine" by John Lennon without hearing the copyrighted recording or without reading the copyrighted lyrics? Or do you expect it to learn everything about it only by reading copyright protected reviews in music magazines and copyright protected Ph.D. theses on popular music?

BootinUp

(47,141 posts)
17. So I recognized the logic behind your post
Mon May 29, 2023, 09:44 PM

and then I objected to the idea. Seems like we can leave it there.

 

Effete Snob

(8,387 posts)
18. How long should copyrights last?
Mon May 29, 2023, 10:12 PM

Just give the number of years.

Also, do you know how long copyright lasted when you were born?

Do you know the rationale behind making copyright term limited instead of lasting forever?

BootinUp

(47,141 posts)
19. That is not what I was discussing and I think it is irrelevant
Mon May 29, 2023, 10:20 PM

to what I was discussing. To be relevant the subject would be about advocating for a change to copyright laws. I was not.

 

Effete Snob

(8,387 posts)
20. Right
Mon May 29, 2023, 10:44 PM

Which means that you are happy with the current terms.

The post to which you responded mentioned several artists - none of whose work is subject to copyright.

The creation of new works requires the use of existing works, and of course authors, artists and others in the creative arts are familiar with their fields and influenced by works and artists which they have studied to achieve mastery in their fields.

Your point is that this is okay, provided that all of the work used (a) was produced by someone who died at least 70 years ago or (b) is owned by a corporation and was first published more than 95 years ago. In many places outside of the US, your position is that it should have been produced by someone who died at least 50 years ago.

But your position relies on the prevailing copyright term laws to define what you believe is, or is not, material which can or should be used to train AI models.

BootinUp

(47,141 posts)
21. You know something, I don't like the way you tried to pin some shit on me.
Mon May 29, 2023, 10:50 PM

I could have more carefully stated why it's not relevant, but I am tired.
