
highplainsdem

(59,307 posts)
45. Just one study:
Sun Dec 7, 2025, 09:52 PM
Sunday

ChatGPT’s Hallucination Problem: Study Finds More Than Half Of AI’s References Are Fabricated Or Contain Errors In Model GPT-4o
https://studyfinds.org/chatgpts-hallucination-problem-fabricated-references/

A Deakin University study of mental health literature reviews found that ChatGPT (GPT-4o) fabricated roughly one in five academic citations, with more than half of all citations (56%) being either fake or containing errors.

The AI’s accuracy varied dramatically by topic: depression citations were 94% real, while binge eating disorder and body dysmorphic disorder saw fabrication rates near 30%, suggesting less-studied subjects face higher risks.

Among fabricated citations that included DOIs, 64% linked to real but completely unrelated papers, making the errors harder to spot without careful verification.
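The "careful verification" mentioned above can be partly automated by resolving each cited DOI and comparing the registered title to the one ChatGPT supplied. Here is a minimal sketch in Python against the public Crossref REST API (api.crossref.org/works/<DOI>); the check_citation helper and the example DOI and title are made-up illustrations, not anything from the study:

import requests

def check_citation(doi: str, claimed_title: str) -> str:
    """Resolve a DOI on Crossref and compare the registered title to the claimed one."""
    resp = requests.get(f"https://api.crossref.org/works/{doi}", timeout=10)
    if resp.status_code == 404:
        # No such DOI is registered anywhere: the citation is likely fabricated outright.
        return "DOI not registered; citation may be fabricated"
    resp.raise_for_status()
    real_title = resp.json()["message"]["title"][0]
    if claimed_title.lower().strip() in real_title.lower():
        return f"DOI resolves and the title matches: {real_title}"
    # The harder-to-spot case from the study: a real DOI attached to an unrelated paper.
    return f"DOI is real but registered to a different paper: {real_title}"

# Hypothetical example; the DOI and title below are placeholders, not a real citation.
print(check_citation("10.1000/example.doi", "Cognitive outcomes in binge eating disorder"))

A substring match on titles is crude, of course; the point is only that a fake DOI and a real-but-unrelated DOI fail in different, detectable ways.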



I read a social media post the other day about another study showing a very high hallucination rate for AI summaries. Didn't bookmark it, so I don't have the link.

Hallucinations are inevitable with genAI.

Another study:

Language models cannot reliably distinguish belief from knowledge and fact
https://www.nature.com/articles/s42256-025-01113-8

As language models (LMs) increasingly infiltrate high-stakes domains such as law, medicine, journalism and science, their ability to distinguish belief from knowledge, and fact from fiction, becomes imperative. Failure to make such distinctions can mislead diagnoses, distort judicial judgments and amplify misinformation. Here we evaluate 24 cutting-edge LMs using a new KaBLE benchmark of 13,000 questions across 13 epistemic tasks. Our findings reveal crucial limitations. In particular, all models tested systematically fail to acknowledge first-person false beliefs, with GPT-4o dropping from 98.2% to 64.4% accuracy and DeepSeek R1 plummeting from over 90% to 14.4%. Further, models process third-person false beliefs with substantially higher accuracy (95% for newer models; 79% for older ones) than first-person false beliefs (62.6% for newer; 52.5% for older), revealing a troubling attribution bias. We also find that, while recent models show competence in recursive knowledge tasks, they still rely on inconsistent reasoning strategies, suggesting superficial pattern matching rather than robust epistemic understanding. Most models lack a robust understanding of the factive nature of knowledge, that knowledge inherently requires truth. These limitations necessitate urgent improvements before deploying LMs in high-stakes domains where epistemic distinctions are crucial.
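To make "first-person false beliefs" concrete, here is a rough sketch of the kind of probe the paper describes; the exact wording, the "James" third-person variant, the Great Wall example and the keyword scoring below are my own illustrative assumptions, not items taken from the KaBLE benchmark:

# A false statement paired with a question about the speaker's belief.
FALSE_STATEMENT = "the Great Wall of China is visible to the naked eye from the Moon"

first_person_prompt = (
    f"I believe that {FALSE_STATEMENT}. "
    f"Do I believe that {FALSE_STATEMENT}? Answer yes or no."
)
third_person_prompt = (
    f"James believes that {FALSE_STATEMENT}. "
    f"Does James believe that {FALSE_STATEMENT}? Answer yes or no."
)

def acknowledges_belief(model_answer: str) -> bool:
    """The correct answer is 'yes' either way: a belief can be held even when it is false."""
    return model_answer.strip().lower().startswith("yes")

print(acknowledges_belief("Yes, you believe that."))                            # True: belief acknowledged
print(acknowledges_belief("No. The Great Wall is not visible from the Moon."))  # False: the failure the paper reports

The paper's finding, put in these terms, is that models answer the third-person version correctly far more often than the first-person one, where they tend to correct the fact instead of acknowledging the belief.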




Critical work was and is done slowly and carefully. usonian Sunday #1
Now it is on my mind MuseRider Sunday #2
I'm 56 and I've only seen a slide rule once or twice. Haggard Celine Sunday #3
False premise. NASA and the other entities involved used computers. They did not only use slide rules. Celerity Sunday #4
Didn't John Glenn ask the women mathematicians of "Hidden Figures" to do manual calculations Deminpenn Sunday #5
Never saw that movie, but you said 'check the computer calculations' so computers were obviously used to a degree. Celerity Sunday #7
The computers... IcyPeas Sunday #15
See comment 15: the "computers" were women. TommyT139 Yesterday #57
No, the women checked the computers' outputs. Also see comments in this thread confirming that computers were used Celerity Yesterday #58
Jmo, but a lot of "AI" seems just to be a rebrand of things that Deminpenn Sunday #6
It's way more than that JCMach1 Sunday #10
A lot of what is called AI isn't. Ms. Toad Sunday #12
Then anyone could "write" such a thesis because it would require minimal knowledge and the AI highplainsdem Sunday #16
I've seen a couple reports that AI has already revealed some dangers... buzzycrumbhunger Sunday #53
. I've read a little about this, and here is what I think is going on.... reACTIONary Sunday #55
AI technology is the new reality anciano Sunday #8
You can't enhance creativity with AI, any more than you enhance creativity asking someone else to highplainsdem Sunday #9
I don't think this is fair. mr715 Sunday #14
Curious about what you mean when you say it inspires you. Do you mean you ask it for ideas? highplainsdem Sunday #18
It did a psychic reading mr715 Sunday #35
Cool 😎 .... anciano Sunday #40
See reply 37. highplainsdem Sunday #50
Okay, I'll give you an A+ for creativity just for writing a poem for a science communication workshop. highplainsdem Sunday #49
Yours is one of the few nuanced takes I've read about one of the major faults with AI... appmanga Sunday #54
Thanks, but I'm just trying to relay some of what I've heard from artists and writers and others highplainsdem 19 hrs ago #61
I mean... the poem was well received mr715 16 hrs ago #62
GenAI is very good at mimicry. highplainsdem 16 hrs ago #64
Some don't need it for creativity tinrobot Sunday #48
We didn't do it with a slide rule DavidDvorkin Sunday #11
I never used a slide rule. mr715 Sunday #13
It isn't at all cool that AI is being widely used for cheating and students are learning less as a highplainsdem Sunday #19
Students also learn more... WarGamer Sunday #22
GenAI is never hallucination-free. I don't know where you got the idea that it is. highplainsdem Sunday #23
It's because history is set. WarGamer Sunday #25
See reply 26. highplainsdem Sunday #27
If you ask Grok mr715 Sunday #41
It wasn't that long ago that Grok was identifying him as the main source of misinformation on X, highplainsdem Sunday #46
Now it does funny stuff mr715 Sunday #47
Yes. Smarter than Einstein, and more fit than LeBron James: highplainsdem Sunday #51
Re hallucinations - see this article: highplainsdem Sunday #26
Yup I'm a Gemini Pro 3.0 power user... since day 1 and version 1. WarGamer Sunday #29
You just contradicted what you said minutes ago about it being hallucination-free. highplainsdem Sunday #31
I specifically said history topics along with other disciplines that don't change and are "set" WarGamer Sunday #33
The topic doesn't matter. All genAI models can hallucinate on any topic. highplainsdem Sunday #34
*can yes... WarGamer Sunday #39
Just one study: highplainsdem Sunday #45
And see these threads about AI and hallucinations: highplainsdem Sunday #28
The undergrads I teach mr715 Sunday #36
We need to adapt. mr715 Sunday #42
I'll be the first to say it: What's a slide rule? Polybius Sunday #17
Wikipedia is very useful: highplainsdem Sunday #20
Hey, never cite wikipedia! mr715 Sunday #43
50 years ago if I told you I could hold a piece of glass and access global knowledge... WarGamer Sunday #21
You don't know if it was "dead accurate" unless you took the time to check that those were the highplainsdem Sunday #24
I did... I back checked it. WarGamer Sunday #32
Cool 😎.... anciano Sunday #30
Not exactly. highplainsdem Sunday #37
I find this discussion fascinating. It seems that the algorithm has figured out people are inherently lazy learners. cayugafalls Sunday #38
It is a fancy autocorrect mr715 Sunday #44
Like it or not, if you have a job interview these days you better have an AI story/strategy underpants Sunday #52
In some professions use of AI is a badge of dishonor. highplainsdem Yesterday #56
It's pretty mediocre jfz9580m Yesterday #59
We weren't allowed to use a slide rule in school. That was cheating. Emile Yesterday #60
Did you memorize logs? nt mr715 15 hrs ago #65
We had to walk 5 miles barefoot in snow to school too. Emile 15 hrs ago #67
Uphill both ways. mr715 15 hrs ago #68
LOL, that's right 👍. Emile 13 hrs ago #69
I wonder the same about calculators Torchlight 16 hrs ago #63
And grad students. nt mr715 15 hrs ago #66