
A General Theory of Overhyped Statistical Machines

Jul 22, 2025

Fine. I know I said the world doesn't need any more articles on AI. However, I'd like to present a two-part explanation of my thinking: firstly, a simple thought experiment, followed by a bit of science to show why, of all the myriad large data statistical models available, LLMs were the one that made the world go bonkers.

I was also a bit rude about generative AI last time, and, should there be any truth to Roko's Basilisk, I don't want any future sentient AI overlords thinking I tried to prevent their ascendance. Plus, it's also a good opportunity to address some of the feedback from the last article.

Before we get to thought experiments and rebuttals, though, my view, such as it is, on "AI" is this:

  • We are experiencing a colossal bubble that will burst fairly soon. I can't say when it will burst because my feeling is that the more money there is at stake, the harder the bubble-deniers double down. My best guess is the next 12 months. And that is longer than others give it.

  • I am very much not _anti-AI_ in the general sense. All tech can be used for good or bad. The web has been slowly enshittified over the last 15 years or so, and I never ranted about HTTP or Javascript (actually, don't quote me on Javascript). So I don't plan to start ranting about LLMs in the same way I never ranted about Support Vector Machines or Collaborative Filtering.

  • What grinds my gears is when people talk about a technology without understanding it, and make claims about what it will do simply to fluff up some investors or be seen to be part of a popular movement. In fact, if you will allow me a short curmudgeonly moment, I'd rather people didn't say anything about technology they don't understand. If you don't know your eigenvectors from your jacobians, maybe don't go layering public proclamations on top.

That's it. I'm not especially radical. To be honest, I've had these thoughts for years, but was too much of a coward to voice them. There's no satisfaction in saying it's all going to collapse in on itself. If I had the choice, I'd rather it didn't. I lived through the dotcom bubble bursting, and tech in general got a bad name for years for no logical reason at all.

Despite it being bleeding obvious the whole world was going online, by the time 2001 was ending, you couldn't get funding for an internet business no matter how good your idea (I know this because I was in a web start-up at the time).

It was stupid because markets are emotional.

Part One: Flowers in the Forest

See if you can feel this intuitively. Let's say you bought a cottage in the countryside. The garden is full of flowers. You don't know the first thing about flowers, but you have some friends who do. The thing is, these friends are not infallible. You invite Bob around and ask him to identify what you have growing in the garden. Bob is right 60% of the time. Understandably, as he identifies each plant, you have 60% confidence in his guesses.

So you invite Alice round. She's also 60% accurate, but her methods aren't the same as Bob's (maybe she focuses more on leaf structure while Bob is a petals and seeds man), which means the 60% she gets right is different from Bob's. If they both agree a plant is Fritillaria meleagris, then you'd probably be inclined to believe them. Not 100% because they're both wrong 40% of the time. But because they use different methods to come to their conclusion, it kind of feels like the odds are good.

What if they disagree? No probs. Here's Carol, who's also right 60% of the time and uses her impressive knowledge of plant heights and colours. Now you can get a majority answer to the question "Is this Fritillaria meleagris?"

And the likelihood that they are collectively correct isn't 60%. It's 65%. Wait. What?

Quick maths. Everyday man's on the block.

We have Marie Jean Antoine Nicolas de Caritat, a man whose credit cards must have been seven inches wide, to thank for this insight. He was the Marquis of Condorcet and this is known as Condorcet's Jury Theorem (he applied it generically to group decision making).

The substance is very simple: if you get a group of people together to make a decision (identify a flower) AND they are all better than chance (>50%) at making that decision AND they approach the problem in different ways (so their errors aren't just copies of each other) AND you have a lot of them, then they collectively make better decisions than any of them individually.

25 people who are 60% good at naming plants would be right 85% of the time. Amazing.
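If you'd rather check those figures than take my word for it, the sum is just the binomial distribution: the jury is right whenever a majority of guessers is right. Here's a minimal Python sketch of my own (assuming the guessers are fully independent and odd in number, so there are no ties):

```python
from math import comb

def majority_accuracy(n_guessers: int, p_correct: float) -> float:
    """Probability that a simple majority of independent guessers,
    each individually correct with probability p_correct, is correct."""
    majority = n_guessers // 2 + 1
    return sum(
        comb(n_guessers, k) * p_correct**k * (1 - p_correct) ** (n_guessers - k)
        for k in range(majority, n_guessers + 1)
    )

print(majority_accuracy(1, 0.6))    # 0.600 -- Bob on his own
print(majority_accuracy(3, 0.6))    # 0.648 -- Bob, Alice and Carol voting
print(majority_accuracy(25, 0.6))   # ~0.85 -- the figure above
print(majority_accuracy(101, 0.6))  # ~0.98 -- gains already slowing
```

Note how the last line hints at the plateau we'll get to later: quadrupling the jury buys far less than the first two dozen guessers did.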

No number of guessers will ever be right _all_ the time, but more guessers mean better accuracy (to a point). I know the 85% figure is correct from this rather splendid interactive explanation of Condorcet's Jury Theorem under its more modern name, the Random Forest machine learning technique. The Random Forest incarnation of Condorcet's idea has been around since the mid-1990s and became popular about 25 years ago. Postulated in 1785, formalised and made usable a couple of decades ago.
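And if you'd rather see the jury as code than as maths, here's a minimal scikit-learn sketch (a synthetic dataset and hyperparameters of my own choosing, not anything from that explanation) where a crowd of deliberately shallow, individually mediocre trees votes its way past a lone tree:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

# A noisy, synthetic classification problem -- a stand-in for the garden.
X, y = make_classification(n_samples=2000, n_features=20, n_informative=8,
                           random_state=0)

# One shallow tree: our lone, half-decent guesser.
lone_guesser = DecisionTreeClassifier(max_depth=2, random_state=0)

# Two hundred shallow trees, each seeing a different bootstrap sample and
# feature subset (Bob, Alice, Carol and friends), settled by majority vote.
jury = RandomForestClassifier(n_estimators=200, max_depth=2, random_state=0)

print("lone tree:", cross_val_score(lone_guesser, X, y, cv=5).mean())
print("forest   :", cross_val_score(jury, X, y, cv=5).mean())
```

On runs like this the forest typically lands a healthy margin above the single tree, which is Condorcet's point made with decision trees instead of dinner guests.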

Come with me now on a journey through time and space. What if, say, 10 years ago, investors had got excited about Random Forests and ploughed billions of dollars into them? Well, we can speculate that with enough guessers and task-specific branching and gating, we'd probably end up with insanely robust classifiers that work across multiple domains. Maybe they'd go multi-modal and work across images, audio, and data. Maybe today you'd be opening up ChatRF and having the classification time of your life.

To be clear, this speculation is a mere fantasy ride. I'm ignoring some research suggesting that Random Forests top out at a few thousand trees and a few billion data samples. But the same was said about neural networks until papers like the Lottery Ticket Hypothesis and Mixture of Experts opened more doors. If the money had poured in, then who knows what might have happened?

What a world that might have been. But, pause for a second, do you think anyone would be saying Random Forests are the new paradigm? About to take millions of jobs and agentically do our shopping by classifying our fridge contents? Would these super guessers have grifting futurists churning out screeds of crap on LinkedIn and clammy Sam Altman on magazine covers?

No. No bugger would give a shit, like no bugger gives a shit about what you do with Postgres.

Because Random Forests don't deal in _language_. Spit out language rather than plant guesses, and our brains cannot help but see a soul in the machine. It's only LLMs that are burdened with the fantasy that if you _juuuust_ pay for one more order of magnitude of data or compute, you’ll get reasoning and agency and consciousness and AGI.

Is comparing Random Forests and LLMs fair? Well, I wouldn't use one for the job of the other, but in this thought experiment, it's what they have in common that is the important part. All powerful statistical engines, all machine learning, all AI, from forests of half-decent guessers to the transformers in LLMs, are bounded by the same truths. At small scale you often see huge accuracy gains (1 to 25 guessers), then emergent capabilities and better generalisation, then diminishing returns, and finally the demand for resources outstrips any gains.

They grow. They plateau. Then they compress.

It's all just statistical correlations between features and outcomes. Only those that output language (of the 50+ statistical techniques that make up "AI") are mistaken for minds.

So yes, it's not a leap to say LLMs are like Random Forests in spirit. LLMs just do their learning in a continuous, dense space and operate over sequences rather than bundles of sample data and decision trees.
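To make the "just statistics, but it outputs language" point concrete, here's a deliberately crude sketch of my own. It's a bigram counter, nowhere near a transformer and not how any real LLM is built, but the core move is the same: given some context, produce a probability distribution over the next token and sample from it.

```python
import random
from collections import Counter, defaultdict

# A tiny "training corpus". Real models see trillions of tokens.
corpus = "the garden is full of flowers and the garden is full of weeds".split()

# "Training": count which word follows which.
next_word_counts = defaultdict(Counter)
for current, following in zip(corpus, corpus[1:]):
    next_word_counts[current][following] += 1

def generate(start: str, length: int = 8) -> str:
    """Sample a continuation, one word at a time, from the counted statistics."""
    word, output = start, [start]
    for _ in range(length):
        counts = next_word_counts.get(word)
        if not counts:
            break
        words, weights = zip(*counts.items())
        word = random.choices(words, weights=weights)[0]
        output.append(word)
    return " ".join(output)

print(generate("the"))  # e.g. "the garden is full of weeds" -- fluent-ish, mindless
```

Swap the counting table for a transformer trained over a continuous embedding space and you have, in spirit, the pipeline that is being mistaken for a mind.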

If we saw this for what it is, and honoured the work done by thousands of thinkers and researchers over what is now hundreds of years, we'd use these statistical marvels (including synthetic text extruders) where they fit, without confusing linguistic fluency for intelligence, or scale for sentience. There is a universal law quietly at work in all statistical learning systems as they scale, and LLMs are not immune.

Part Two: Mistaken for Minds

So why do we think LLMs are special? And look, honestly, the first time I used ChatGPT, I thought it must be a very clever fake. The last time I had much to do with NLP was back in the days of embeddings, and that was a wild ride visualising language as a multi-dimensional space you could mathematically navigate.
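(For anyone who missed that era: an embedding turns each word into a list of numbers, and "navigating" the space just means doing ordinary geometry on those lists. A toy sketch with made-up three-dimensional vectors, purely for flavour; real embeddings have hundreds of learned dimensions:)

```python
import math

# Hypothetical 3-d "meanings" -- invented for this example, not real data.
embedding = {
    "flower": [0.9, 0.1, 0.6],
    "weed":   [0.8, 0.2, 0.3],
    "fire":   [0.1, 0.9, 0.9],
}

def cosine(a, b):
    """Similarity of direction: values near 1.0 mean 'pointing the same way'."""
    dot = sum(x * y for x, y in zip(a, b))
    norms = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norms

print(cosine(embedding["flower"], embedding["weed"]))  # high: near neighbours
print(cosine(embedding["flower"], embedding["fire"]))  # lower: further apart
```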

It's because we are hard-wired for it.

Language is weird. If someone runs into a room shouting "fire!" it has a visceral effect. You get a shot of adrenaline and start planning your exit. But why? It's just four letters. They mean nothing on their own. Would you run if someone shouted "Huǒ"? That's Mandarin for fire. Same danger. Language matters, but not because of the characters we choose to encode it. It's the meaning we apply.

Patricia Kuhl, professor of speech and hearing sciences at the University of Washington, showed that babies do not learn language simply by hearing and repeating words in isolation (i.e. what an LLM does). What they do instead is deeply social. They learn words (or rather sounds that they then map to words) and their meanings through interactive engagement with other humans, where the meanings are more apparent.

For example, when a baby watches television, language learning only occurs when a parent is actively co-viewing, interacting and responding. Language, on its own, is not inherently meaningful to the human mind without shared context and intent. You could play Firestarter by The Prodigy all day long, and no baby is getting scared. Throw Keith Flint (RIP) in for some interaction, and it'll quickly work out some meaning.

As adults, we mostly use spoken language to develop a picture of the mind or the context or the intent of the speaker ("is Keith going to eat me"). When the language is written, we also construct imagery in our minds.

Roland Barthes, the famous literary theorist, argued that once words are committed to the page, the author's intent becomes irrelevant. The author is, in essence, dead. In the case of the works of Shakespeare, that's not a figurative statement. Meaning is an emergent property of the reader's (or listener's) experience that arises from the interpretive act.

Words, in a sense, have their own life. They may not have a soul or consciousness as they sit on a page or get chopped up into tokens to be scoffed up by an LLM, but they have the capacity to evoke interpretation, connection, and thought.

And here's where I diverge from the pure anti-AI perspective. Whilst LLMs have no intelligence or ability to reason, they are not failed vessels of understanding. Their outputs may lack intention, but they can still be said to participate in meaning-making of a sort. It would be wrong to compare synthetically extruded text to the words of a dead author. Clearly, the latter was created by a force of agency and intention. But they both can resonate, provoke, confuse, and illuminate the reader as they complete the act of meaning. This shouldn't be a surprise because agency and intention (the best and the worst kinds) went into the training data.

LLMs will never produce high-quality writing or inspired writing or create new insight, except by random happenstance. But all the words they produce will evoke in us a response that could.

Overhyped Statistical Machines

Beautiful? Yes. Shimmering reflections of the dead, not-so-dead, and angrily litigious authors whose work went into the mincing machine? Also, yes.

Valuable in the right context? No doubt.

But anything more than that (on current models)? No. Most definitely no.

Like blockchain, another fascinating technology with high CPU demand and a very narrow application window, we probably need to let people get excited, try it out, realise either it doesn't do what they thought or that more appropriate tech already exists, and then deal with the consequences. In the meantime, please save every idiotic, baseless prediction and grandiose business realignment so you can send them back when the investors take a deep bath.

Meanwhile, here are three criticisms of the last article that I think represent commonly held views:

Criticism 1: You are calling this too soon

Maybe. I can't predict the future. That’s why I’m writing this and not buying a lottery ticket.

Maybe this is like "predicting that Friends would burn out after season 4" (thanks Richard). Arguably, Friends got better after season 4 when Emily left.

I feel like calling it now because of The Gap. There is, in all likelihood, an uncomfortably large chasm between cost and price at the big AI service providers (OpenAI, Anthropic, xAI, Mistral, etc.). The question is whether it can be bridged. Anthropic, as one example, is catastrophically unprofitable. They've raised about $15 billion from investors, made about $900 million, and lost about $5.5 billion. I say "about" because no one really knows.

Most start-ups run at a loss in the beginning. It’s almost a badge of honour. But the gap eventually has to be closed. These companies all need to IPO at off-the-charts valuations to pay back investors, and then make a profit on products where inference costs rise with demand and training costs look like small country GDPs.

I’m calling it now because this feels like Silicon Valley season 4 and Erlich Bachman has left the show. I could absolutely be calling it too soon, but investors must be watching for exits. It only takes one to call it a bad idea and spook the rest.

Cursor's recent price hike was not a good omen. Ed Zitron called it:

"one of the most dramatic and aggressive price increases in the history of software, with effectively no historical comparison. No infrastructure provider in the history of Silicon Valley has so distinctly and aggressively upped its prices on customers, let alone their largest and most prominent ones, and doing so is an act of desperation that suggests fundamental weaknesses in their business models."

I’m also calling it now because this tech isn’t new. It may have reached popular awareness in late 2022, but its roots go back 20 years.

This is Game of Thrones season 8.

Criticism 2: It’s not a tool, it’s a platform for innovation

I hear this a lot. Yes, LLMs spew duff data. Yes, the investment got a bit heady. Yes, there are teething issues with the business model. But you're missing the point, right? See the bigger picture. This is like the internet. A foundation for entirely new ways of working. A dawn of a new era.

Alas, bollocks. A platform needs a substrate and composability. The internet had RFCs. The Web had HTTP. We got Facebook and Amazon because there was a coherent platform to build on. Also, which of the many AI/ML approaches are we weaving into this platform? Does it need an Enterprise Service Bus now? You can't platform a scientific discipline. You need an abstraction based on a specification, not a daydream.

Criticism 3: What? About? All? The? Evidence?

Bubble schmubble. Platform schmatform. Who cares? What about the enterprises and projects and start-ups delivering value and revolutionising their businesses? What about the job losses? What about all the conference talks and LinkedIn posts? This many people can’t be wrong.

Look. I'm Mulder here. I want to believe. Yes, there are lots of pilots in big businesses. Some good evidence shows that LLMs can boost productivity in software delivery. But where are the CEOs putting their names and reputations behind it? Where are the independently verified case studies? When the dotcoms came, evidence was everywhere. Valuations went mad, but the tech worked.

According to Amara's Law (we overestimate a technology's effect in the short run and underestimate it in the long run), LLMs will probably end up embedded in hundreds of products where the fit is good, the cost is reasonable, and the risk of error is low. But right now? Mostly noise.

And about those job losses: yes, it’s bleak. News like this and this is deeply discouraging. But the world economy is also in a tough place. Around the turn of the century, growth was about 8%, dipped as low as -4%, and only recently crawled near 4% again before tariff chaos hit. Tech companies overhired. Corrections are happening. It’s awful. But it's not because we made a machine that can guess the next word.

Summary

The problem might be this: within a family of genuinely interesting tools, tools that previously struggled to capture investor interest, one emerged that spits out language. Despite being powered only by probabilities and randomness, it is impressive. And useful. And, thanks to the Eliza Effect, it made otherwise sensible people a bit giddy. Once the money flowed in, the mess followed.

Could there still be a path forward? Hopefully yes. Smaller, domain-specific models for medicine or software development. General-purpose writing aids and summarisation tools. They won’t justify the hype or the investment, but they could support smaller, profitable businesses. They won’t replace jobs. But they might reshape them.

And through it all, these tools will need humans. To feed them. To use them. And to check their work.
