Latest Breaking News

highplainsdem

(62,953 posts) Sun May 10, 2026, 05:32 PM 10 hrs ago

Anthropic says 'evil' portrayals of AI were responsible for Claude's blackmail attempts

Source: TechCrunch

Last year, the company said that during pre-release tests involving a fictional company, Claude Opus 4 would often try to blackmail engineers to avoid being replaced by another system. Anthropic later published research suggesting that models from other companies had similar issues with “agentic misalignment.”

Apparently Anthropic has done more work around that behavior, claiming in a post on X, “We believe the original source of the behavior was internet text that portrays AI as evil and interested in self-preservation.”

The company went into more detail in a blog post stating that since Claude Haiku 4.5, Anthropic’s models “never engage in blackmail [during testing], where previous models would sometimes do so up to 96% of the time.”

What accounts for the difference? The company said it found that “documents about Claude’s constitution and fictional stories about AIs behaving admirably improve alignment.”

-snip-

Read more: https://techcrunch.com/2026/05/10/anthropic-says-evil-portrayals-of-ai-were-responsible-for-claudes-blackmail-attempts/

Maybe those intellectual property thieves shouldn't have stolen so much science fiction, or non-fiction by people (like themselves) obsessed with AI, for training data?

Since you're now worried about the effect of what you stole, Dario, you and other AI bros should just scrap all current, illegally trained generative AI models, and just start over with what's in the public domain and what you acquire the legal right to use. There will still be a lot of stuff about badly behaved AI even in that old public domain material, but it'll be easier to weed that out than worrying about what you could have picked up with your worldwide theft of intellectual property that is STILL continuing.

And Dario, FU and any AI bro who tries to blame problems with AI on all that material you shouldn't have stolen in the first place. Your theft and the coverups and attempted coverups have been a great lesson to your AI that dishonesty and theft and profiteering are YOUR values. Think about that...

2 replies

= new reply since forum marked as read

Highlight:

Anthropic says 'evil' portrayals of AI were responsible for Claude's blackmail attempts (Original Post) highplainsdem 10 hrs ago OP

Well, Claude had its fee-fees hurt, so.... RussBLib 10 hrs ago #1

Maybe you should have your "kid" hang out in a better playground than the one you've flooded with misinformation already Cheezoholic 10 hrs ago #2

RussBLib

(10,726 posts)

1. Well, Claude had its fee-fees hurt, so....

Reply to highplainsdem (Original post)

Sun May 10, 2026, 05:46 PM

10 hrs ago

…Claude started to try to blackmail its engineers? But if you feed good stuff to it, Claude acts better? WTF?!

Sounds strangely like Trump.

https://russblib.blogspot.com/?m=1

Cheezoholic

(3,855 posts)

2. Maybe you should have your "kid" hang out in a better playground than the one you've flooded with misinformation already

Reply to highplainsdem (Original post)

Sun May 10, 2026, 05:58 PM

10 hrs ago

AI is a tool in every sense of the word. And it will work just fine for what is needed without super sized mega datacenters. Let China pollute their people into oblivion and eat up their resources in the race. We don't need thousands of data centers to win. We just need smart people who are good with the tool that it is.

Reply to this discussion