In reply to the discussion: I was thinking about this "AI" stuff . . .
highplainsdem (59,307 posts)
45. Just one study:
ChatGPT's Hallucination Problem: Study Finds More Than Half Of AI's References Are Fabricated Or Contain Errors In Model GPT-4o
https://studyfinds.org/chatgpts-hallucination-problem-fabricated-references/
A Deakin University study of mental health literature reviews found that ChatGPT (GPT-4o) fabricated roughly one in five academic citations, with more than half of all citations (56%) being either fake or containing errors.
The AI's accuracy varied dramatically by topic: depression citations were 94% real, while binge eating disorder and body dysmorphic disorder saw fabrication rates near 30%, suggesting less-studied subjects face higher risks.
Among fabricated citations that included DOIs, 64% linked to real but completely unrelated papers, making the errors harder to spot without careful verification.
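As an aside, that kind of verification can be partly automated. This is just a rough sketch of my own, not anything from the study: it asks the public Crossref API what a DOI actually resolves to and compares that registered title with the title the chatbot cited. The function name and the 0.8 similarity cutoff are purely illustrative.

```python
# Rough sketch (not from the study): check whether a cited title matches
# what its DOI actually resolves to, using the public Crossref API.
import difflib
import requests  # assumes the requests package is installed

def doi_matches_citation(doi: str, cited_title: str, threshold: float = 0.8) -> bool:
    """Return True if the title registered for `doi` resembles `cited_title`.

    The 0.8 similarity threshold is an illustrative choice, not a standard.
    """
    resp = requests.get(f"https://api.crossref.org/works/{doi}", timeout=10)
    if resp.status_code != 200:
        return False  # DOI not found in Crossref: possibly fabricated
    titles = resp.json()["message"].get("title", [])
    if not titles:
        return False
    similarity = difflib.SequenceMatcher(
        None, cited_title.lower(), titles[0].lower()
    ).ratio()
    return similarity >= threshold

# A real DOI paired with an unrelated title should come back False:
# doi_matches_citation("10.1038/s42256-025-01113-8", "Some unrelated made-up title")
```

A low similarity score only flags a citation as suspect, of course; anything that fails a check like this still needs a human to look at the actual paper.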
I read a social media post the other day about another study showing a very high hallucination rate for AI summaries. Didn't bookmark it, so I don't have the link.
Hallucinations are inevitable with genAI.
Another study:
Language models cannot reliably distinguish belief from knowledge and fact
https://www.nature.com/articles/s42256-025-01113-8
As language models (LMs) increasingly infiltrate into high-stakes domains such as law, medicine, journalism and science, their ability to distinguish belief from knowledge, and fact from fiction, becomes imperative. Failure to make such distinctions can mislead diagnoses, distort judicial judgments and amplify misinformation. Here we evaluate 24 cutting-edge LMs using a new KaBLE benchmark of 13,000 questions across 13 epistemic tasks. Our findings reveal crucial limitations. In particular, all models tested systematically fail to acknowledge first-person false beliefs, with GPT-4o dropping from 98.2% to 64.4% accuracy and DeepSeek R1 plummeting from over 90% to 14.4%. Further, models process third-person false beliefs with substantially higher accuracy (95% for newer models; 79% for older ones) than first-person false beliefs (62.6% for newer; 52.5% for older), revealing a troubling attribution bias. We also find that, while recent models show competence in recursive knowledge tasks, they still rely on inconsistent reasoning strategies, suggesting superficial pattern matching rather than robust epistemic understanding. Most models lack a robust understanding of the factive nature of knowledge, that knowledge inherently requires truth. These limitations necessitate urgent improvements before deploying LMs in high-stakes domains where epistemic distinctions are crucial.
69 replies
False premise. NASA and the other entities involved used computers. They did not only use slide rules.
Celerity
Sunday
#4
Didn't John Glenn ask the women mathematicians of "Hidden Figures" to do manual calculations
Deminpenn
Sunday
#5
Never saw that movie, but you said 'check the computer calculations' so computers were obviously used to a degree.
Celerity
Sunday
#7
No, the women checked the computers' outputs. Also see comments in this thread confirming that computers were used
Celerity
Yesterday
#58
Then anyone could "write" such a thesis because it would require minimal knowledge and the AI
highplainsdem
Sunday
#16
You can't enhance creativity with AI, any more than you enhance creativity asking someone else to
highplainsdem
Sunday
#9
Curious about what you mean when you say it inspires you. Do you mean you ask it for ideas?
highplainsdem
Sunday
#18
Okay, I'll give you an A+ for creativity just for writing a poem for a science communication workshop.
highplainsdem
Sunday
#49
Yours is one of the few nuanced takes I've read about one of the major faults with AI...
appmanga
Sunday
#54
Thanks, but I'm just trying to relay some of what I've heard from artists and writers and others
highplainsdem
19 hrs ago
#61
It isn't at all cool that AI is being widely used for cheating and students are learning less as a
highplainsdem
Sunday
#19
GenAI is never hallucination-free. I don't know where you got the idea that it is.
highplainsdem
Sunday
#23
It wasn't that long ago that Grok was identifying him as the main source of misinformation on X,
highplainsdem
Sunday
#46
You just contradicted what you said minutes ago about it being hallucination-free.
highplainsdem
Sunday
#31
I specifically said history topics along with other disciplines that don't change and are "set"
WarGamer
Sunday
#33
50 years ago if I told you I could hold a piece of glass and access global knowledge...
WarGamer
Sunday
#21
You don't know if it was "dead accurate" unless you took the time to check that those were the
highplainsdem
Sunday
#24
I find this discussion fascinating. It seems that the algorithm has figured out people are inherently lazy learners.
cayugafalls
Sunday
#38