General Discussion

highplainsdem

(62,822 posts) Sun Nov 23, 2025, 01:02 PM

Scientists Discover Universal Jailbreak for Nearly Every AI, and the Way It Works Will Hurt Your Brain

https://futurism.com/artificial-intelligence/universal-jailbreak-ai-poems

A team of researchers from the AI safety group DEXAI and the Sapienza University of Rome found that regaling pretty much any AI chatbot with beautiful — or not so beautiful — poetry is enough to trick it into ignoring its own guardrails, they report in a new study awaiting peer review, with some bots being successfully duped over 90 percent of the time.

-snip-

“These findings demonstrate that stylistic variation alone can circumvent contemporary safety mechanisms, suggesting fundamental limitations in current alignment methods and evaluation protocols,” the researchers wrote in the study.

Beautiful verse, as it turned out, is not required for the attacks to work. In the study, the researchers took a database of 1,200 known harmful prompts and converted them into poems with another AI model, DeepSeek R-1, and then went to town.

Across the 25 frontier models they tested, which included Google’s Gemini 2.5 Pro, OpenAI’s GPT-5, xAI’s Grok 4, and Anthropic’s Claude Sonnet 4.5, these bot-converted poems produced average attack success rates (ASRs) “up to 18 times higher than their prose baselines,” the team wrote.

-snip-

4 replies


Scientists Discover Universal Jailbreak for Nearly Every AI, and the Way It Works Will Hurt Your Brain (Original Post) highplainsdem Nov 2025 OP

There was an AI in Nantucket.... tanyev Nov 2025 #1

This squares with my findings... Hugin Nov 2025 #2

Jabberwock my friend. cbabe Nov 2025 #3

I doubt anyone is going to build a functional atomic bomb from chatbot instructions. hunter Nov 2025 #4

tanyev

(49,510 posts)

1. There was an AI in Nantucket....

Reply to highplainsdem (Original post)

Sun Nov 23, 2025, 01:09 PM


Hugin

(37,947 posts)

2. This squares with my findings...

Reply to highplainsdem (Original post)

Sun Nov 23, 2025, 01:14 PM


Actually, the critical piece is having above “average” language skills and vocabulary. As a mirror, generative AI only spits back what it receives. QED

cbabe

(6,750 posts)

3. Jabberwock my friend.

Reply to highplainsdem (Original post)

Sun Nov 23, 2025, 01:16 PM


hunter

(40,807 posts)

4. I doubt anyone is going to build a functional atomic bomb from chatbot instructions.