Why Your AI Chatbot Sounds So… Average (And Why It Matters)

Have you ever been working with an AI chatbot and felt… underwhelmed? You ask it to generate something complex, and it returns a summary. You provide nuanced details, and it skips right over them, opting for a shorter, more generic answer.

If you’ve noticed this, you’re not alone. It often feels like the AI is taking shortcuts, and in a way, it is.

This phenomenon points to a core limitation of today’s general-purpose AI models: they are, by their very nature, “averaging machines.” They don’t create “great” or “exceptional” content in the way a human expert does. Instead, they regurgitate a highly sophisticated average of all the words they’ve ever been trained on.

If you want to understand the limits of AI, this “averaging” effect is the most important concept to grasp.

The Calculator for Words

Think of it like this: if a calculator’s job is to find the average, median, or mode of a set of numbers, an AI’s job is to find the “average” word or sentence from its training data.

When you give it a prompt, it’s not thinking in the way a person does. It’s not creating a novel argument. It is statistically predicting the most likely next word based on the context you gave it, weighed against the entirety of its massive (but finite) dataset.

What is the “most likely” word? It’s the most common, most-used, most average one.
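
To make that concrete, here is a deliberately tiny sketch in Python. It is a toy bigram lookup table, not a real LLM, and the "corpus" is invented purely for illustration; real models learn billions of weights rather than counting pairs of words, but the core move is the same: return the statistically most common continuation.

```python
from collections import Counter, defaultdict

# A toy "training corpus", invented for illustration. A real model sees trillions of words.
corpus = (
    "it is important to note that the results are average . "
    "it is important to note that the data is average . "
    "it is important to remember that the details matter ."
).split()

# Count how often each word follows each other word (a simple bigram table).
next_word_counts = defaultdict(Counter)
for current, following in zip(corpus, corpus[1:]):
    next_word_counts[current][following] += 1

def predict_next(word: str) -> str:
    """Return the most frequent word that followed `word` in the corpus."""
    counts = next_word_counts[word]
    return counts.most_common(1)[0][0] if counts else "<unknown>"

# The prediction is always the most common continuation, never the most insightful one.
print(predict_next("to"))         # "note" (seen twice) beats "remember" (seen once)
print(predict_next("important"))  # "to"
```

Scale that lookup table up to billions of learned weights and a context window of thousands of words, and the same principle is still doing the work.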

This is why AI-generated text often feels a bit flat. It’s why chatbots tend to overuse the same common phrases (“In conclusion,” “It’s important to note,” “delve into”). They are statistically safe bets. This averaging effect is actively reducing the quality and depth of knowledge by smoothing out the very details, exceptions, and nuances that define expertise.

When “Good Enough” Becomes Dangerous

This isn’t always a bad thing. If you’re writing a simple blog post without specific details or generating ideas for a social media update, an “average” assist might be fine. It’s a great starting point.

The problem arises when we apply this “averaging machine” to tasks that demand specificity and perfection.

  • A PhD paper requires novel thought and deep, specific analysis—the exact opposite of an average.
  • A legal document hinges on precise, unambiguous language where a single missed detail can change the entire meaning.
  • A technical specification must be exact. An “average” measurement or a “summarised” instruction is not just low-quality; it’s completely non-functional.

Using a generic AI for these tasks isn’t just a shortcut; it’s a direct reduction in quality and a massive injection of risk.

Real-World Examples: The “Averaging” Effect in Action

This “averaging” isn’t just a theory; it’s a daily reality. The AI’s tendency to smooth over details and confidently guess the “average” answer leads to errors and “hallucinations.”

A recent, major investigation by the BBC found that leading AI chatbots are “failing at summarising news.” When testing models like ChatGPT, Google’s Gemini, and Microsoft’s Copilot, they found that:

  • Nearly half (45%) of AI-generated news summaries had “significant issues.”
  • One in five had “major accuracy issues,” including “hallucinated details and outdated information.”

Specific examples of this “averaging” effect leading to dangerously wrong answers include:

  • Factual Reversals: Google’s Gemini confidently stated that the UK’s National Health Service (NHS) does not recommend vaping as a method to quit smoking. The real and specific fact is the complete opposite: the NHS website explicitly does recommend it. The AI defaulted to a vague, “average” (and incorrect) public health stance.
  • Outdated Summaries: Both Copilot and ChatGPT incorrectly stated that Rishi Sunak and Nicola Sturgeon were still in office, long after they had stepped down. The AI’s knowledge had “averaged out” the most recent, specific facts with the older, more common data it was trained on.
  • Confident Hallucinations: When asked about two NASA astronauts temporarily stranded on the ISS, Gemini asserted that astronauts had “never been stranded in space” and told the human researcher they might be “confusing this with a sci-fi movie.”
  • Legal Disasters: In a now-famous case, lawyers submitted a legal brief to a court that was filled with completely fabricated legal citations and case law. Their AI chatbot had “averaged” the style of a legal brief and invented specifics that looked plausible but were entirely false.

In all these cases, the AI didn’t know the answer. Instead of saying so, it delivered a statistically “average” and confident-sounding answer, which was completely wrong.

How AI “Loses” Information (The Technical-ish Part)

You might wonder how this is possible given how much data these models are trained on. This is where the “averaging” becomes literal. The key concept to understand is lossy compression.

When you save a photo as a JPEG or music as an MP3, the file is made smaller by permanently throwing away “less important” information. You can’t get that data back.
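
If you want to see that irreversibility first-hand, here is a minimal sketch assuming the Pillow imaging library is installed and using a placeholder file called original.png. Saving at a low JPEG quality throws detail away, and re-opening the file does not bring it back.

```python
from PIL import Image  # assumes the Pillow library is installed

# "original.png" is a placeholder; substitute any image you have on disk.
original = Image.open("original.png").convert("RGB")

# Saving at low JPEG quality permanently discards detail to shrink the file.
original.save("compressed.jpg", quality=10)

# Re-opening the JPEG does not recover the discarded information.
roundtrip = Image.open("compressed.jpg").convert("RGB")
identical = list(original.getdata()) == list(roundtrip.getdata())
print(f"Pixel-for-pixel identical after the round trip? {identical}")  # almost certainly False
```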

An LLM is a form of lossy compression for all its training data.

Take Wikipedia as an example. The compressed text of English Wikipedia is around 24GB. Yet, there are powerful LLMs as small as 5GB that were trained on Wikipedia data (among many other sources).

How can the model be smaller than the data it was trained on?

Because the model doesn’t store the data; it compresses it through a process of statistical generalisation. During training (using methods like backpropagation), the model “squishes” every new piece of data into its existing network of parameters (or weights). It doesn’t memorise the fact; it learns the statistical relationship between the words used to describe that fact.

To fit that 24GB of specific facts into a 5GB model, details must be lost. The original, specific pathway to a single fact is “averaged” with all the other pathways around it. The model retains the “average” of the knowledge, not the specific, granular facts.
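
A rough back-of-the-envelope illustration makes the point; the byte counts and quantisation level below are assumptions for the sake of arithmetic, not measurements of any particular model.

```python
# Illustrative, order-of-magnitude arithmetic only; real figures vary by model and quantisation.
wikipedia_text_gb = 24   # compressed English Wikipedia text, per the figure above
model_size_gb = 5        # a small LLM trained on it (among many other sources)

bytes_per_parameter = 1  # assume an 8-bit quantised model for simplicity
parameters = model_size_gb * 1e9 / bytes_per_parameter
print(f"Roughly {parameters / 1e9:.0f} billion parameters")  # ~5 billion

# Even ignoring everything else in the training mix, the Wikipedia text alone
# is several times larger than the model, so storing it verbatim is impossible.
print(f"Training text is ~{wikipedia_text_gb / model_size_gb:.1f}x the model's size")
```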

When you ask a question, the AI doesn’t “look up” the answer. It follows the most statistically probable path through its compressed network. If that path has been “averaged” too much, it leads to a hallucination. The model is essentially guessing, but with the confidence of all the data it has ever consumed.
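
As a loose illustration of that confidence, the sketch below uses made-up scores for four candidate answers. Softmax turns them into probabilities and the model emits the top one, even though it is barely more likely than the alternatives.

```python
import math

def softmax(scores):
    """Turn raw scores into probabilities that sum to 1."""
    exps = [math.exp(s) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical scores for four candidate answers; none of them stands out.
candidates = ["1965", "1972", "1983", "it never happened"]
scores = [1.10, 1.00, 0.90, 1.05]

probs = softmax(scores)
best = max(range(len(candidates)), key=lambda i: probs[i])

for answer, p in zip(candidates, probs):
    print(f"{answer:>18}: {p:.2f}")

# The top answer is emitted with full fluency even though it is barely
# more likely than the alternatives the model silently discarded.
print(f"Answer given: {candidates[best]!r} (probability only {probs[best]:.2f})")
```

There is no built-in step that says “none of these is likely, so admit uncertainty”; something always comes out on top, and it is delivered in the same confident tone either way.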

The Big Questions We Aren’t Asking

This reality raises critical questions that we, as a society focused on the opportunities of AI, often skip over.

Is the AI’s “average” better and higher quality than the “average” quality of humans?

In some cases, yes. An AI’s summary of a topic may be more coherent than that of a poorly informed person. But this is the wrong comparison. We don’t rely on “average” humans for high-stakes tasks; we rely on experts.

Is the human error rate higher than the AI’s error rate?

This is the key question. A human expert might make a “human error”—a typo, a forgotten date. An AI makes an “averaging error”—it might confidently smooth over a critical legal distinction or hallucinate a fact because it “sounds” statistically probable. Which error has a bigger impact on quality?

Should AI chatbots be banned from producing documents that must be specific and cannot afford to miss details, such as laws, Wikipedia entries, or court documents?

AI is a new automation tool, and we have not yet fully mapped its limits. We are still in the honeymoon phase, focused on its future promise rather than its current, very real limitations.

Beyond the Average: A Future Warning

The future is not all average. It is likely we will eventually have highly specialised, massive AI models. These models might be trained only on high-quality, verified legal data or peer-reviewed scientific papers, with enough parameters to avoid this drastic averaging effect. These models may one day truly exceed the “human average”.

But until that time, we are using generic AIs trained on the broad, messy, and often very “average” internet.

Be aware of what you are using. When you need a summary, an idea, or a simple draft, today’s AI is a powerful tool. But when you need “great” or “exceptional”—when you need precision, depth, and novel insight—you must be the human in the loop. Otherwise, you’re just asking for the average.

Contact Vertex Cyber Security, leaders in cyber security for AI and tech.
