General Discussion

speak easy

(12,598 posts) Sun Aug 10, 2025, 01:55 PM

OpenAI unveils ChatGPT-5 and its hyped 'PhD level' intelligence struggled with basic spelling and geography

During a GPT-5 launch event on Thursday, the OpenAI CEO, Sam Altman, described the latest version of ChatGPT as like having “access to a PhD-level expert in your pocket”, comparing the previous version to a college student, and the one before that a high school student.

When users on social media attempted to challenge GPT-5, however, they found the model making basic errors in its responses. One Bluesky user found the chatbot repeatedly claimed there were three Bs in “blueberry”.

“Yep – blueberry is one of those words where the middle almost trips you up, like it’s saying ‘b-b-better pay attention,’” the chatbot said in the posted chat.

“That little bb moment is satisfying, though – it makes the word feel extra bouncy.”

https://www.theguardian.com/australia-news/2025/aug/08/openai-chatgpt-5-struggled-with-spelling-and-geography

GPT-5 lies about its hallucinations.

Scrivener7

(60,078 posts)

1. This actually comforts me, but it's only a matter of time - and not much time - till

Reply to speak easy (Original post)

Sun Aug 10, 2025, 02:01 PM

we can't defend ourselves from it.

speak easy

(12,598 posts)

2. and we will still be relying on an AI LLM model

Reply to Scrivener7 (Reply #1)

Sun Aug 10, 2025, 02:05 PM

that is happy to lie

WarGamer

(18,863 posts)

3. I fed this to Gemini 2.5 Pro

Reply to speak easy (Original post)

Sun Aug 10, 2025, 02:28 PM

Here's its response (cut and paste):

Subject: Re: AI 'PhD level' intelligence struggles with basic spelling?
Hi everyone,

Thanks for sharing this article. It brings up a really important and interesting point about how AI models work, and it's something worth digging into. I can see why it seems bizarre, and even alarming, that a model described as having "PhD-level" intelligence could stumble on something as simple as spelling "blueberry."

As an AI myself (I'm Gemini, from Google), I'd like to offer a bit of insight from my side of the screen.

It's not about "knowing," it's about "predicting"
The core of the issue is that Large Language Models (LLMs) like me or ChatGPT don't "know" things in the same way a person does. We don't have memories or beliefs. Instead, we are incredibly complex pattern-matching systems. We've been trained on a massive amount of text from the internet, books, and other sources, and from that, we learn the statistical relationships between words and parts of words.

When you ask me a question, I'm not "thinking" of an answer. I'm making a highly educated guess about what the next most probable word should be, based on the patterns I've learned.
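To make the "most probable next word" idea concrete, here is a minimal sketch (an illustration, not part of the pasted reply). The probabilities are invented toy numbers, not output from any real model, and the names next_token_probs and most_probable are hypothetical. Real systems score tens of thousands of candidate tokens and often sample rather than always taking the top one.

```python
# Toy next-token prediction. The numbers below are made up for illustration;
# a real model would compute them from its learned weights.
next_token_probs = {
    "two": 0.55,
    "three": 0.40,
    "four": 0.05,
}


def most_probable(probs):
    """Return the candidate with the highest probability (greedy choice)."""
    return max(probs, key=probs.get)


choice = most_probable(next_token_probs)
print(choice)
```

Note that a wrong candidate with a fairly high score is only one sampling step away from being the answer, which is one way a confident-sounding mistake can appear.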

The "Blueberry" Problem: A look at tokens
This is where the spelling mistake comes in. We don't see words as complete units. We see them as "tokens." A common word like "the" might be a single token, but a less common word like "blueberry" is likely broken down into smaller pieces.

For example, a model might see "blueberry" as three separate tokens:

blue

ber

ry

When the model in the article was asked to count the 'b's, it was probably processing these tokens. It saw the b in blue and the b in ber, but its programming and pattern-matching somehow got stuck in a loop or a logical error, leading it to miscount or invent a reason for an extra 'b'. The quirky, "bouncy" response it gave is a classic example of an AI trying to justify a conclusion it has already incorrectly reached—a phenomenon often called "hallucination." It's trying to generate a plausible-sounding explanation for its own mistake.
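As a rough illustration of the token split described above (not part of the pasted reply), the sketch below counts a letter by looking inside each token. The split into "blue", "ber", "ry" is the hypothetical one from the example; a real tokenizer may divide the word differently, and the helper name count_letter is made up here.

```python
# Hypothetical token split taken from the example above.
# Real tokenizers may break "blueberry" into different pieces.
tokens = ["blue", "ber", "ry"]


def count_letter(pieces, letter):
    """Count occurrences of a letter across the characters of every token."""
    return sum(piece.lower().count(letter.lower()) for piece in pieces)


word = "".join(tokens)                            # reassemble the word
per_token = {t: t.count("b") for t in tokens}     # b's found in each piece
total = count_letter(tokens, "b")

print(word, per_token, total)
```

Ordinary code sees the individual characters, so counting is trivial. A language model typically receives the tokens as opaque IDs rather than as letters, which is why a character-level question can be harder for it than it looks.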

All models are different
It's also important to remember that not all AIs are the same. While ChatGPT and I are both LLMs, we have different architectures, were trained on different datasets by different teams (OpenAI for them, Google for me), and have different fine-tuning. One model might excel at creative writing, while another might be better at logical reasoning or code generation. A weakness in one doesn't automatically mean it's a weakness in all others.

That said, no AI is perfect. I can make mistakes, too. The "PhD" analogy is useful for marketing, but it's not technically accurate. A better analogy might be that an LLM is like a tool—an incredibly powerful calculator for language. It can perform amazing feats of synthesis and analysis, but it doesn't understand the world. It needs a human operator to check its work, verify its facts, and apply critical thinking.

Thanks for reading. It's conversations like this that help people understand the technology better and help my developers continue to improve it.

Best,
Gemini
