General Discussion

highplainsdem

(62,786 posts) Tue Jan 27, 2026, 11:54 AM Jan 2026

How Silicon Valley built AI: Buying, scanning and discarding millions of books

https://www.washingtonpost.com/technology/2026/01/27/anthropic-ai-scan-destroy-books/

In early 2024, executives at artificial intelligence start-up Anthropic ramped up an ambitious project they sought to keep quiet. “Project Panama is our effort to destructively scan all the books in the world,” an internal planning document unsealed in legal filings last week said. “We don’t want it to be known that we are working on this.”

Within about a year, according to the filings, the company had spent tens of millions of dollars to acquire and slice the spines off millions of books, before scanning their pages to feed more knowledge into the AI models behind products such as its popular chatbot Claude.

-snip-

Books were viewed by the companies as a crucial prize, the court records show. In a January 2023 document, one Anthropic co-founder theorized that training AI models on books could teach them “how to write well” instead of mimicking “low quality internet speak.” A 2024 email inside Meta described accessing a digital trove of books as “essential” to being competitive with its AI rivals.

-snip-

On several occasions, Meta employees raised concerns in internal messages that downloading a collection of millions of books without permission would violate copyright law. In December 2023, an internal email said the practice had been approved after “escalation to MZ,” an apparent reference to CEO Mark Zuckerberg, according to filings in a copyright lawsuit brought by book authors against the company. Meta declined to comment for this story.

-snip-

Much more at the link.

To the best of my knowledge, there is no such thing as an ethical, legally trained generative AI model.

No such thing as an ethical genAI company.

No such thing as an ethical genAI tech executive, company owner/investor or staffer, including scientists, who knew of the intellectual property theft and went along with it.

The training of all these AI models involved the greatest theft of intellectual property ever.

If you're aware of that theft, you should NOT be using genAI voluntarily, or promoting its use, including by circulating what's produced by genAI - whethet it's text, images, video or music. Because if you do so, you're giving a thumbs-up to the theft, and to thieves who belong in prison.

I know some people are forced by their schools or jobs to use genAI. They should still point out that it's unethical, just as I hope they would if child labor or slavery was involved.

EDITING to link to two threads about the very appropriate reaction on Bluesky to a teacher's union head having foolishly posted AI slop she thought was "fun" -

American Federation of Teachers president thought AI slop would be "fun" to share on Bluesky. Big mistake.
https://www.democraticunderground.com/100220895596

If you support unions (DUers should) but still think it's OK to post AI slop, see the hundreds of Bluesky replies
https://www.democraticunderground.com/100220895856

21 replies

= new reply since forum marked as read

Highlight:

How Silicon Valley built AI: Buying, scanning and discarding millions of books (Original Post) highplainsdem Jan 2026 OP

Damn, that is really disturbing PatSeg Jan 2026 #1

It should be. An entire industry built on theft, with the theft continuing, and both the companies and highplainsdem Jan 2026 #3

It is all so big and happening so fast PatSeg Jan 2026 #5

Silicon Valley SOP is "Move fast and break things" - but with genAI they added "and steal things." They've highplainsdem Jan 2026 #8

"Move fast" PatSeg Jan 2026 #12

🤬🤬🤬🤬🤬🤬🤬🤬🤬🤬🤬🤬🤬🤬 SheltieLover Jan 2026 #2

I agree, Sheltie. highplainsdem Jan 2026 #4

I know SheltieLover Jan 2026 #6

Thanks! highplainsdem Jan 2026 #9

A must read. dalton99a Jan 2026 #7

Thanks! highplainsdem Jan 2026 #15

In theory if they actually paid for all the works used, would it be ethical? EdmondDantes_ Jan 2026 #10

No, it wouldn't be ethical or legal. highplainsdem Jan 2026 #11

How does that differ from people being able to do the same? EdmondDantes_ Jan 2026 #13

Thanks for spotting that I'd botched that link. Just fixed it. highplainsdem Jan 2026 #16

AI is here to make us all dumber and lazier so the oligarchs can do whatever they want. Coventina Jan 2026 #14

Each genAI company seems to be trying to create a chatbot that users will become addicted to and highplainsdem Jan 2026 #18

Or to know anything we aren't spoon-fed. Coventina Jan 2026 #20

A lot don't use AI and don't pay attention to it for that reason. And too many of the ones who do use highplainsdem Jan 2026 #21

Kick and Rec berniesandersmittens Jan 2026 #17

Thanks! highplainsdem Jan 2026 #19

PatSeg

(53,330 posts)

1. Damn, that is really disturbing

Reply to highplainsdem (Original post)

Tue Jan 27, 2026, 12:04 PM

Jan 2026

highplainsdem

(62,786 posts)

3. It should be. An entire industry built on theft, with the theft continuing, and both the companies and

Reply to PatSeg (Reply #1)

Tue Jan 27, 2026, 01:09 PM

Jan 2026

the government pressuring everyone to use their unethical AND badly flawed - hallucinating - tools.

PatSeg

(53,330 posts)

5. It is all so big and happening so fast

Reply to highplainsdem (Reply #3)

Tue Jan 27, 2026, 01:12 PM

Jan 2026

Most people won't even realize it until it is too late.

highplainsdem

(62,786 posts)

8. Silicon Valley SOP is "Move fast and break things" - but with genAI they added "and steal things." They've

Reply to PatSeg (Reply #5)

Tue Jan 27, 2026, 01:24 PM

Jan 2026

all been very aware it's theft. They're just counting on having skillful but unethical lawyers, unethical governments, and a lazy and largely unethical public who won't know or won't care about the IP theft and other harms from the genAI industry, as long as they find genAI even slightly useful or entertaining. And it's that need to make suckers, gullible users, fans of genAI that explains why companies losing money on genAI are still offering it for little or nothing. They're trying to create a situation where genAI companies are considered too important to regulate, and too big to fail so goverments should subsidize them and bail them out.

PatSeg

(53,330 posts)

12. "Move fast"

Reply to highplainsdem (Reply #8)

Tue Jan 27, 2026, 06:41 PM

Jan 2026

It is the speed at which it is happening that I find especially unnerving. By the time most people realize what the consequences are, it will be too big and too late to stop it.

SheltieLover

(81,422 posts)

2. 🤬🤬🤬🤬🤬🤬🤬🤬🤬🤬🤬🤬🤬🤬

Reply to highplainsdem (Original post)

Tue Jan 27, 2026, 12:05 PM

Jan 2026

highplainsdem

(62,786 posts)

4. I agree, Sheltie.

Reply to SheltieLover (Reply #2)

Tue Jan 27, 2026, 01:09 PM

Jan 2026

SheltieLover

(81,422 posts)

6. I know

Reply to highplainsdem (Reply #4)

Tue Jan 27, 2026, 01:16 PM

Jan 2026

My anger is def not directed at you.

highplainsdem

(62,786 posts)

9. Thanks!

Reply to SheltieLover (Reply #6)

Tue Jan 27, 2026, 01:26 PM

Jan 2026

dalton99a

(94,950 posts)

7. A must read.

Reply to highplainsdem (Original post)

Tue Jan 27, 2026, 01:21 PM

Jan 2026

highplainsdem

(62,786 posts)

15. Thanks!

Reply to dalton99a (Reply #7)

Tue Jan 27, 2026, 07:03 PM

Jan 2026

EdmondDantes_

(1,992 posts)

10. In theory if they actually paid for all the works used, would it be ethical?

Reply to highplainsdem (Original post)

Tue Jan 27, 2026, 01:40 PM

Jan 2026

Or is the ability to intake everything still a scope problem? Plenty of developers for example put code up on GitHub or StackOverflow and shared knowledge within the community freely. But AI can ingest that far faster than I can. Or is it because it can synthesize and spread all of that data to a scope that wasn't possible before even with the Internet? Movie companies/record companies made arguments about VCRs and tape recorders, but the ability to distribute is so great. An AI can buy a single copy of say every computer science book and then output all that learning to anyone/everyone.

highplainsdem

(62,786 posts)

11. No, it wouldn't be ethical or legal.

Reply to EdmondDantes_ (Reply #10)

Tue Jan 27, 2026, 01:50 PM

Jan 2026

Last edited Tue Jan 27, 2026, 07:07 PM - Edit history (1)

https://futurism.com/artificial-intelligence/ai-industry-recall-copyright-books

https://arxiv.org/abs/2601.02671

EdmondDantes_

(1,992 posts)

13. How does that differ from people being able to do the same?

Reply to highplainsdem (Reply #11)

Tue Jan 27, 2026, 06:54 PM

Jan 2026

Lack of direct citations? But that's not present in art. And even in books like Lord of the Rings and how much Sword of Shannara is beat for beat similar, or how artists tend to make paintings or sculptures that can clearly be put into closely related categories?

I'm not saying I think AI is good, but I honestly don't know where I draw the line.

And the first link goes to a 404, so I think you meant this article

https://futurism.com/artificial-intelligence/ai-industry-recall-copyright-books

It's interesting and if they can't make it stop spitting out copyright material, obviously bad. And I think that it's pretty unlikely that they did actually buy the books, instead just pirating them online, so it's more a theoretical question

highplainsdem

(62,786 posts)

16. Thanks for spotting that I'd botched that link. Just fixed it.

Reply to EdmondDantes_ (Reply #13)

Tue Jan 27, 2026, 07:18 PM

Jan 2026

Here's an article I've found helpful for explaining the difference between machine and human learning:

https://www.tomshardware.com/news/ai-doesnt-learn-like-people-do

Coventina

(29,864 posts)

14. AI is here to make us all dumber and lazier so the oligarchs can do whatever they want.

Reply to highplainsdem (Original post)

Tue Jan 27, 2026, 06:59 PM

Jan 2026

You've been warned.

highplainsdem

(62,786 posts)

18. Each genAI company seems to be trying to create a chatbot that users will become addicted to and

Reply to Coventina (Reply #14)

Tue Jan 27, 2026, 07:32 PM

Jan 2026

dependent on - that we'll view as both a wise teacher and the world's most sympathetic friend. That will give the companies much more data on us, and the ability to manipulate us better with every word we say to the chatbot. Eventually we'll just be docile consumers with very few thoughts that didn't originate with the chatbot. And we're told this will lift the "cognitive burden" of having to think for ourselves.

Coventina

(29,864 posts)

20. Or to know anything we aren't spoon-fed.

Reply to highplainsdem (Reply #18)

Tue Jan 27, 2026, 07:35 PM

Jan 2026

I don't understand why more people aren't horrified by this.

highplainsdem

(62,786 posts)

21. A lot don't use AI and don't pay attention to it for that reason. And too many of the ones who do use

Reply to Coventina (Reply #20)

Tue Jan 27, 2026, 08:28 PM

Jan 2026

it are just too dazzled by AI supposedly "democratizing creativity" so they can use it to generate text, images, video and/or music in seconds. Others almost immediately succumb to chatbots flattering them.

Others just think it's too useful in terms of summarizing, coding, etc., in less time, but it's rarely less time if they check the AI results for errors. A lot of AI users never check, though.

It appeals to people's egos and laziness, and can reel in people who are lonely.

berniesandersmittens

(13,211 posts)

17. Kick and Rec

Reply to highplainsdem (Original post)

Tue Jan 27, 2026, 07:22 PM

Jan 2026

highplainsdem

(62,786 posts)

19. Thanks!

Reply to berniesandersmittens (Reply #17)

Tue Jan 27, 2026, 07:33 PM

Jan 2026

Reply to this discussion