Welcome to DU! The truly grassroots left-of-center political community where regular people, not algorithms, drive the discussions and set the standards. Join the community: Create a free account Support DU (and get rid of ads!): Become a Star Member Latest Breaking News Editorials & Other Articles General Discussion The DU Lounge All Forums Issue Forums Culture Forums Alliance Forums Region Forums Support Forums Help & Search

highplainsdem

(63,107 posts)
Thu Apr 24, 2025, 08:33 PM Apr 2025

OpenAI's new reasoning AI models hallucinate more

https://techcrunch.com/2025/04/18/openais-new-reasoning-ai-models-hallucinate-more/

OpenAI’s recently launched o3 and o4-mini AI models are state-of-the-art in many respects. However, the new models still hallucinate, or make things up — in fact, they hallucinate more than several of OpenAI’s older models.

Hallucinations have proven to be one of the biggest and most difficult problems to solve in AI, impacting even today’s best-performing systems. Historically, each new model has improved slightly in the hallucination department, hallucinating less than its predecessor. But that doesn’t seem to be the case for o3 and o4-mini.

-snip-

OpenAI found that o3 hallucinated in response to 33% of questions on PersonQA, the company’s in-house benchmark for measuring the accuracy of a model’s knowledge about people. That’s roughly double the hallucination rate of OpenAI’s previous reasoning models, o1 and o3-mini, which scored 16% and 14.8%, respectively. O4-mini did even worse on PersonQA — hallucinating 48% of the time.

Third-party testing by Transluce, a nonprofit AI research lab, also found evidence that o3 has a tendency to make up actions it took in the process of arriving at answers. In one example, Transluce observed o3 claiming that it ran code on a 2021 MacBook Pro “outside of ChatGPT,” then copied the numbers into its answer. While o3 has access to some tools, it can’t do that.

-snip-
2 replies = new reply since forum marked as read
Highlight: NoneDon't highlight anything 5 newestHighlight 5 most recent replies
OpenAI's new reasoning AI models hallucinate more (Original Post) highplainsdem Apr 2025 OP
I wish ai were illegal. SheltieLover Apr 2025 #1
That says a lot. AI draws from our society for cachukis Apr 2025 #2

cachukis

(4,083 posts)
2. That says a lot. AI draws from our society for
Thu Apr 24, 2025, 08:49 PM
Apr 2025

answers. Perhaps part of our social consciousness is a bit off.

Kick in to the DU tip jar?

This week we're running a special pop-up mini fund drive. From Monday through Friday we're going ad-free for all registered members, and we're asking you to kick in to the DU tip jar to support the site and keep us financially healthy.

As a bonus, making a contribution will allow you to leave kudos for another DU member, and at the end of the week we'll recognize the DUers who you think make this community great.

Tell me more...

Latest Discussions»General Discussion»OpenAI's new reasoning AI...