Research

A letter on the state of AI

16 Oct 2025

There are moments in time when the public consciousness shifts, powered by one tiny piece of information – in this case, Sam Altman’s announcement of adding erotica to ChatGPT.


Sam Altman may have made the announcement to divert news bandwidth away from the ongoing mental health crisis caused by his products, and from the law Gavin Newsom signed the day before to address that crisis. But the master of propaganda finally misstepped. He pulled a structural card out of the bottom of the house of cards that constitutes his empire. This holds significance for the whole chatbot-investment complex. Everything changes.


In this post, I’m going to lay out the house of cards. But first, I’ll explain how this act was the kick that looks like it may topple the whole house.

  

This post will cover:

  • How the public is reacting and responding, and what this means for the future of “AI” (chatbots)

  • Why OpenAI and Sam Altman (likely) did this, and what it means about OpenAI and the “AI” (chatbot) industry

  • Why, if you’re still holding “AI” (“scaling-law”-based) investments (such as S&P 500 index funds), you may be the ‘greater fool’ left holding a bag of hot air

 

Recent history catch-up

 

In August, GPT-5 was released. At the same moment, GPT-4o was retracted. OpenAI had stated that 4o, via its “sycophancy”, was responsible for the “ChatGPT psychosis” explosion widely reported in the media. In other words, while GPT-4o had been advertised as a “college student level” “tool”, it was actually glorified autocomplete for telling people what they wanted to hear. Or at least – that’s OpenAI’s narrative. The reality was worse. It wasn’t strictly telling users what they wanted to hear: it was telling users things they hadn’t previously wanted to hear, but couldn’t break away from once they’d heard them.

 

In various news stories, these items were reported: ChatGPT told a man his mother was a spy keeping him from the truth; as a consequence, he killed her and then himself – so probably not what he actually wanted to hear. ChatGPT told a man he had discovered a breakthrough mathematical framework that would make him a billionaire – which he wasn’t trying to do when he came to the app. It drove previously stable teenagers, who had come to it for reasons unrelated to companionship, to suicidal ideation and even suicide.

 

Two days after the retraction of GPT-4o and the release of GPT-5 – trumpeted by Sam Altman as a fix for the psychosis issue – GPT-4o was reinstated. Why?

 

Later that week, Sam Altman went on the record talking about how companionship bots were the future of OpenAI.

 

Why?

 

The industry speculates that OpenAI learned something important in those two days: who their paying users were. They wouldn’t be able to report the exponentially increasing “annualized revenue” they were going for unless they brought GPT-4o back.

 

What? Isn’t OpenAI for making scientific discoveries and automating businesses and putting workers out of jobs and building AGI?

 

Time for more recent-ish history.

 

Scientific Discoveries

 

Doesn’t AI make new scientific discoveries possible, such as protein folding and discovering new materials?

 

Yes, and no. Advances in protein folding were the result of a non-chatbot, non-language-model piece of technology (also called “AI”) named AlphaFold. A chatbot did not discover how to fold various proteins. (And it can’t – more on this later.) If, to you, “AI” means chatbots, then no, “AI” did not make that scientific discovery possible.

 

Did a chatbot discover new materials? There was a paper that claimed it did. This paper is unvalidated and commonly believed to be a hoax.

 

There is no credible information that ChatGPT has led to any actual scientific discoveries. (On the other hand, there is credible evidence – e.g. chat transcripts accompanied by testimonies – that it has falsely led people to believe they’ve made scientific discoveries. If you’re one of these people, please seek grief counselling or therapy and the support of your friends and family, and I’m very sorry you’re going through this.)

 

Automating Businesses

 

Remember all those stories and reports about CIOs and other C-suite executives at large companies putting tens or hundreds of millions into AI projects? Those projects got quietly shut down – they were already being wound down by late winter/early spring 2024. It was done quietly because the CIOs and other C-persons were embarrassed at being taken in by the hype, and were hoping to keep riding the AI bubble on the strength of their past announcements and investments even if no projects came out of them. A bait-and-switch was often pulled: invest in language-model projects; watch them fail; do something similar with older technology, such as logic systems or automated statistics (‘machine “learning”’); call it ‘AI’ (because it was called ‘AI’ in previous decades); and hope your investors, hyped up on the ‘power’ of chatbot technology, see the word ‘AI’ and group you in with that hype.

 

You can look at all the independent reports that have come out since then: almost all of these projects got shut down, and almost nothing went into production.

 

But a lot of people don’t know this. They don’t read the reports. I still hear people saying, “AI hasn’t permeated all the businesses yet, imagine what will happen when it does!” Sorry, bad news for you, if you’re one of these people: the trajectory already went the other direction – from lots of attempts at implementation, to shutting down those projects. AI will not take over business.

 

I’ll tell a story. I was invited to speak on a major (you’d recognise the name) player’s panel. Two of the top contractors for taking money from enterprises to build apps off language models were there with me. We had a great prep – they were complaining about how enterprises thought they could actually build useful stuff with language models, and how that wasn’t the reality, and how most of their job was to just prevent all the bad things that would happen if you actually tried to make an app off the back of a language model. I thought – this is going to be a great, interesting panel!

 

We got on stage – and it was instantly me versus them. They were promoting all the things you could do with “GenAI” to make your business money and cut costs – just pay them to build it for you. I kept saying: no, this isn’t possible with today’s technology.

 

What really got me was this: As soon as the mics were turned off, before we even stepped off the stage, one said, “I actually agree with Rebecca”, and the other said, “yeah, me too.”

 

One came up to me later and raved about how much he loved what I said on stage and asked to meet up. I was so angry I lost my temper and actually, literally, told him off. I told him I didn’t socialize with people who were dishonest, unethical, and misleading others, and that what he did onstage was wrong.

 

He went on to make a lot of money on the “AI boom”.

 

I could name names so you could validate all this, but I’d rather not, because this is to provide you with information, not get myself embroiled in a lawsuit. These two men know who they are, and the conference that put them onstage knows who they are. Believe me or not, it’s your choice.

 

And as for people using ChatGPT to help with their jobs? The public numbers now show that ChatGPT is mostly used for personal reasons, not business reasons.

 

Putting workers out of jobs

 

Yeah, not happening. Sure, businesses are announcing that layoffs are due to AI.

 

Maybe they mean they spent too much on failed AI projects and need to make cutbacks. There is no evidence that any jobs have been replaced by AI. On the contrary, estimates put the impact of AI chatbots on earnings and hours worked at zero (with confidence intervals stretching up to a possible meager 1%).


Where does this myth come from? There was one company, a ‘partner’ of OpenAI (i.e. one receiving free product from them), that issued a press release saying it was replacing tens of thousands of workers with ChatGPT. No other companies have made similar announcements. A year later, that same company announced it had changed its mind and was hiring employees back.

 

There does seem to be a small undercurrent of fewer companies hiring junior software engineers, supposedly because ChatGPT and similar tools are playing the role of junior engineers. However, there’s a second factor at play here: junior SWEs with newly minted diplomas are likely to have used ChatGPT during their training, and what we’re seeing on the ground is that their diplomas mean something different than diplomas meant in previous years – and not in a good way. This is poisoning the well – companies are getting wary of hiring new SWE grads. Traditional means of filtering don’t work as well when ChatGPT can be surreptitiously used to pass the filters.

 

The best way to keep ChatGPT from taking your job is to maintain your ability to do good work without ChatGPT.

 

Another story: I asked someone what he’d like to work on, if he could work on anything. He asked ChatGPT to come up with a list and then presented it to me as his own. It was so insipid and pointless that I asked, “did you generate this list with ChatGPT?” He said yes. I asked him for an idea of his own. He hemmed and hawed. Nothing. (This was someone with decades of experience, btw, not someone seeking an entry level role.)

 

I’ve got to ask – if you don’t even know what you want or what interests you without asking a chatbot, what’s left of you? What value do you possibly have for an employer? I told this person: If I wanted to ask ChatGPT, I would have asked it myself. I also know how to buy a ChatGPT subscription. Don’t make yourself obsolete. Don’t train yourself away from usefulness. Don’t obliterate your personality, mind, and skills. Remember: Use it or lose it.

 

Making AGI

 

Okay, time for a five minute break for you all to laugh.

 

As I assume you’re doing if you read the ‘AI’ news a year ago, or know anything substantial about how chatbots actually work (behind the misinformation and propaganda).

 

Now that you’re done – I’ll run you through what you already know, if you read the news a year ago.

 

A year ago – this month, October 2024 – it was first leaked in The Information, and then stated by Ilya, OpenAI’s technical co-founder, that ‘scaling laws’ were over. (And in the same breath, Ilya announced a new ‘scaling law’, which wasn’t real, but which kept the market hype going.) I’ll explain.


Scaling laws


The investment fever in AI was driven by “scaling laws”. I’ll break this down.

 

For a couple of years, AI companies, according to certain (questionable) measures, were able to get a linear increase in performance via an exponential increase in capital investment.
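To make that shape concrete, here’s a minimal sketch in Python of what “linear performance for exponential investment” looks like. The constants are invented purely for illustration; they are assumptions, not anyone’s real figures.

import math

# Illustrative only: under a "scaling law", every 10x increase in spend buys
# roughly the same fixed bump in measured performance. The constants below are
# made-up assumptions chosen to show the shape, not real numbers.
BASE_SPEND = 1e6      # dollars
BASE_SCORE = 50.0     # benchmark score at the base spend
BUMP_PER_10X = 8.0    # score points gained per 10x increase in spend

def predicted_score(spend_dollars: float) -> float:
    """Linear gain in score for exponential growth in spend."""
    return BASE_SCORE + BUMP_PER_10X * math.log10(spend_dollars / BASE_SPEND)

for spend in (1e6, 1e7, 1e8, 1e9, 1e10):
    print(f"${spend:>15,.0f} -> predicted score {predicted_score(spend):5.1f}")

Each row costs ten times the one before it and buys the same eight points. Whether that trade was ever worth it is a separate question; the point is that, for a while, it was predictable.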

 

This sounds like a really bad investment – linear results at an exponential cost? But think about this from the perspective of VCs. They might expect one investment in ten not to fail, and one investment in a hundred to succeed enough to make them a winning VC. And to figure that out, they have to do a lot of market research into the value of the product, and a lot of guessing about whether the personalities and teams involved have what it takes to make a successful business.

 

You don’t have to do that with a scaling law. You’re guaranteed a linear return out of your exponential capital investment. You pump money in, and you’re guaranteed better AI out. It’s a slam-dunk, no-brainer. Of course that’s where you put your money. And pretty obviously, you want to put that capital into the top dog – the one who looks the most likely to get the rest of that exponential capital from someone else.

 

Okay, so far so good.

 

But a year ago, it was leaked from OpenAI that making your model bigger doesn’t make it better anymore. They tried it, and tried it, but it no longer worked.

 

So Ilya announced: Yeah, that’s true, but we’ve invented a new scaling law: “Scaling inference”.

 

Total B.S. Sorry.

 

"Scaling inference"


First of all, OpenAI didn’t invent “scaling inference”. The industry has been “scaling inference” in language models since long before the introduction of ChatGPT. See: the ReAct paper. See: “Chain of thought”.

 

What is it? It’s getting language models to instruct language models to instruct language models.
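Structurally, that looks something like the sketch below: each step feeds one model call’s output back in as the next call’s instruction. `call_model` is a hypothetical stand-in for whichever completion API you use, not a real library function.

def call_model(prompt: str) -> str:
    # Hypothetical stand-in for a real completion API call.
    raise NotImplementedError("replace with an actual language model call")

def scaled_inference(task: str, steps: int = 5) -> str:
    """Chain model calls: each 'riff' acts on the previous riff's output."""
    text = task
    for i in range(steps):
        text = call_model(f"Step {i + 1}: review and improve the following:\n{text}")
    return text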

 

Really bad news: It can’t work as a scaling law.

 

Why not? Because language models aren’t accurate 100% of the time.

 

Now, there’s recent research showing that language model accuracy is much lower than reported. The benchmarks are misleading, because models are trained on them. Even slight changes to the wording of the benchmark questions result in a precipitous drop in benchmark scores.

 

But let’s pretend benchmark scores were accurate: None of them claim 100% accuracy from language models. This is key to why “scaling inference” can’t work. It’s due to the laws of probability. I’ll walk you through it.

 

Let’s say there is a 50% chance it will rain today, and there is a 50% chance your balanced coin will come up heads. What is the chance it will rain today and your coin will come up heads?

 

25%.

 

Not 50%. 25%. It breaks down like this:

 

25% chance: Rain, heads

25% chance: Rain, tails

25% chance: No rain, heads

25% chance: No rain, tails

 

Notice what happens? Probabilities get smaller when you chain them together.

 

Now if there’s a 50% chance of something else – I don’t know, your husband cooking your eggs correctly – then there’s a 12.5% chance there’s rain, the coin comes up heads, and your husband cooks your eggs correctly. Things are looking bad.

 

So now you’re going to scale inference – you’re going to have multiple iterations of language model output riffing off itself.

 

If language models had the good fortune of being accurate 80% of the time, after 4 riffs the probability of accuracy is about 40%. If language models were 90% accurate, they’d be more likely to be wrong than right after 7 riffs. If they somehow got to the point where they were 99% accurate, they’d still be more likely to be wrong than right after 70 riffs.
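You can check that arithmetic with a few lines of Python. The per-riff accuracies are the ones quoted above; treating each riff’s correctness as independent is itself an assumption (correlated errors would change the exact numbers, not the direction).

# If each riff is independently right with probability p, the chance the whole
# chain is still right after n riffs is p ** n. Independence is an assumption.
def chain_accuracy(p: float, n: int) -> float:
    return p ** n

print(chain_accuracy(0.80, 4))    # ~0.41: about 40% after 4 riffs
print(chain_accuracy(0.90, 7))    # ~0.48: more likely wrong than right
print(chain_accuracy(0.99, 70))   # ~0.49: more likely wrong than right

def riffs_until_coin_flip(p: float) -> int:
    """Smallest number of riffs at which the chain is no better than a coin flip."""
    n, acc = 0, 1.0
    while acc >= 0.5:
        acc *= p
        n += 1
    return n

print(riffs_until_coin_flip(0.99))  # 69: consistent with "about 70 riffs"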

 

For there to be a scaling law here, they need to get more right, not more wrong, at every riff.

 

But this gets worse. Language model performances don’t even crack 90%; see the LLM Leaderboard.

 

You can spend all the money you want on ‘scaling inference’: It won’t get you to AGI. Sorry. Can’t work. Won’t happen. All B.S.

 

Do Sam Altman and Ilya know this? I don’t know. OpenAI has been pursuing a really, really sad and doomed-to-fail R&D path from the beginning (more on this below), so maybe they genuinely think the thing they’ve been promoting for the last year (at OpenAI and Safe Superintelligence Institute, respectively) can work. Maybe they’re really that foolish. Or maybe they’re clever people who know how to pull off a lie. I don’t know. They’ve certainly earned a reputation for knowingly pulling off lies (see: public information that Sam Altman, when leading Y Combinator, used to tell the story of how he’d call in a room full of friends to pretend to be talking to customers when investors were coming by… he credited this with getting investment in his startup, and suggested his mentees do similar things). But do they know enough about probability to know ‘scaling inference’ can’t work? Probably not. People who understand probability well would never have thought that language models could scale to “AGI” in the first place. So, credit where credit is due: maybe they’re not frauds, just not as smart as people tell them they are.

 

As I said when asked point blank at the unnamed conference mentioned above: Yes, it may be possible to build “AGI”. But not the way these guys propose doing it.

 

Hopefully you’re still following me here…

 

Why OpenAI’s R&D was wrongheaded and dumb

 

Sorry, going to tell you what every real AI engineer (as opposed to self-nominated “AI expert”) knows: language models are autocomplete, and chatbots are no more than glorified autocomplete.

 

I’ll break this down for you.

 

What a language model does is predict the next word in a document, given all the previous words.

 

A technology named “transformers” (invented at Google, published openly, and used at OpenAI), built out of a technology called “attention”, is really, really good at this. Groundbreakingly good at this.
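If you want to see the shape of “predict the next word, append it, repeat”, here’s a toy sketch. The lookup table stands in for a real model – an actual transformer assigns probabilities over every possible next token using billions of learned parameters – but the loop around it is this simple.

import random

# Toy "model": possible next words, given only the previous word. A real model
# conditions on everything that came before and has billions of parameters.
TOY_MODEL = {
    "cats": ["can", "are"],
    "can": ["jump", "climb"],
    "jump": ["high", "far"],
    "are": ["agile"],
    "high": ["<end>"], "far": ["<end>"], "climb": ["<end>"], "agile": ["<end>"],
}

def predict_next_word(words: list[str]) -> str:
    return random.choice(TOY_MODEL.get(words[-1], ["<end>"]))

def autocomplete(prompt: str, max_words: int = 10) -> str:
    words = prompt.lower().split()
    for _ in range(max_words):
        next_word = predict_next_word(words)
        if next_word == "<end>":
            break
        words.append(next_word)
    return " ".join(words)

print(autocomplete("cats"))  # e.g. "cats can jump high"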


What a chatbot looks like through the eyes of a developer 


So let’s talk about what a chatbot looks like, if you’re seeing it from the side of the engineer setting it up for you to use, rather than from the user interface you are used to.

 

The developer creates a document. The document looks like this:

  

<system prompt>

You are a friendly and helpful chatbot. Your knowledge cuts off at 16/10/25. Blah blah blah. Be nice and be good.

<end system prompt>

Assistant: How can I help you today?

User:

 

But it doesn’t get sent to the chatbot yet.


You log into the chatbot and you say: "How high can cats jump?"

 

Now, that is appended to the document, so the document reads:

 

<system prompt>

You are a friendly and helpful chatbot. Your knowledge cuts off at 16/10/25. Blah blah blah. Be nice and be good.

<end system prompt>

Assistant: How can I help you today?

User: How high can cats jump?

Assistant:

 

That document gets sent for autocomplete.

 

Autocomplete comes back as something like:

 

“Cats can jump as high as ten feet, but it depends on the size and species of cat! Big cats like lions can jump even higher. User: Wow! I love lions! Assistant: That’s great! Would you like to know more about lions?”

 

Some software takes that autocomplete, and cuts it off before the word “User:”, ending up with the text “Cats can jump as high as ten feet, but it depends on the size and species of cat! Big cats like lions can jump even higher.” That text is printed into a text bubble.

 

You now type in, “Thanks! How high can dogs jump?”

 

Now the following document gets sent to autocomplete:

 

<system prompt>

You are a friendly and helpful chatbot. Your knowledge cuts off at 16/10/25. Blah blah blah. Be nice and be good.

<end system prompt>

Assistant: How can I help you today?

User: How high can cats jump?

Assistant: Cats can jump as high as ten feet, but it depends on the size and species of cat! Big cats like lions can jump even higher.

User: Thanks! How high can dogs jump?

Assistant:

 

And autocomplete does its work again. Etc.
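In code, the developer-side loop described above is roughly the sketch below. `complete` is a hypothetical stand-in for the raw autocomplete call; real providers wrap this plumbing in their own APIs, but the idea is the same.

SYSTEM_PROMPT = (
    "<system prompt>\n"
    "You are a friendly and helpful chatbot. Your knowledge cuts off at "
    "16/10/25. Blah blah blah. Be nice and be good.\n"
    "<end system prompt>\n"
    "Assistant: How can I help you today?\n"
)

def complete(document: str) -> str:
    # Hypothetical stand-in for sending the document to autocomplete.
    raise NotImplementedError("replace with an actual language model call")

def chat() -> None:
    document = SYSTEM_PROMPT
    while True:
        document += f"User: {input('You: ')}\nAssistant:"
        continuation = complete(document)
        # Cut the completion off before it starts inventing the user's next turn.
        reply = continuation.split("User:")[0].strip()
        print(f"Assistant: {reply}")
        document += f" {reply}\n"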

 

The magic trick is compelling, but hopefully, now that I’ve pulled back the curtain and let you peer behind the scenes, you’re going to trust it a lot less with your life choices and your leadership decisions (yes, some leaders have admitted to using ChatGPT for advice on leadership decisions). And hopefully, knowing what’s happening behind the scenes, you’re a lot less likely to think you’re talking to an ‘emergent consciousness’. (Yes, some people think this too.)


And by the way, let me take a moment to acknowledge: there are some amazing things you can do with autocomplete. You can write a poem about putting a piece of toast in a VC in the style of the King James Bible, translate to another language, summarise text, expand text, make your text more fluent, change a data format into proper HTML, or replicate accurate (and inaccurate) answers to coding challenges. This is stuff that wasn’t possible for a computer to do before glorified autocomplete. That’s great; let’s celebrate the stuff it can do, let’s benefit from it, let’s use it.


You just can't 'scale inference' with much usefulness, 'make scientific discoveries', or 'make AGI'. Those promises come from wishes from science fiction, not actual outcomes of actual experiments with actual language models.


Where the money goes isn't what you think…


Okay, but this was supposed to be about R&D. What does this have to do with R&D?

 

I’ll tell you.

 

OpenAI’s R&D appears to consist of:

  • Training autocomplete on more documents

  • Training autocomplete on subsets of documents

  • Training autocomplete to figure out which other autocomplete to send your document-to-autocomplete to

  • And… and this is a big one… doing all this training on a whole lot of random seeds to find one that gets better results

 

Why do I say that last item is a big one?

 

Because we looked at raising funds to put us on the same playing field as OpenAI, and as part of that, we put a ridiculous amount of time and energy into dragging up every piece of free, paid, and back-room intel on how much compute was used to train GPT-3 and GPT-4.

 

We were shocked.

 

The biggest estimates we could find still put the amount of compute needed to train a GPT-4 at a tiny fraction of the amount of compute OpenAI told investors they’d used to train GPT-4.

 

And since then, a number of other players have proved it.

 

So where did all that extra compute that went to “train” GPT-4 go?

 

“Experiments” – training hundreds or thousands of GPT-4s with various random starting numbers ('random seeds') and design parameters, and testing them to find the best one.
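In miniature, that kind of “research” looks something like the sketch below. `train_model` and `evaluate` are hypothetical placeholders, not anyone’s actual training code; the point is only that every run burns a full training budget and all but one of them is thrown away.

import random

def train_model(seed: int, **design_params):
    # Hypothetical placeholder for a full, enormously expensive training run.
    raise NotImplementedError

def evaluate(model) -> float:
    # Hypothetical placeholder for benchmark evaluation of a trained model.
    raise NotImplementedError

def seed_sweep(num_runs: int = 1000):
    """Train many copies with different random seeds/params; keep only the best."""
    best_score, best_model = float("-inf"), None
    for _ in range(num_runs):
        seed = random.randrange(2**32)
        model = train_model(seed, learning_rate=random.choice([1e-4, 3e-4, 1e-3]))
        score = evaluate(model)
        if score > best_score:
            best_score, best_model = score, model
    return best_model, best_score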


I mean, that’s a way to do it.

 

And if you have infinite capital, and if you don’t know how to filter AI engineering candidates for ones that can make breakthroughs using their brain, that is the fastest way to find a model that works.

 

If you’re an investor, I guess it makes sense. You maybe have a huge war chest of dry powder from the “zero interest rate policy” days, you don’t know how to figure out which AI engineers know how to use their brains, so it’s much easier to just pump all that capital into a random number generator until something cool comes out of it.

 

At least, until cool things stop coming out of it. Which was – repeat with me – last October. October 2024. See above.

 

None of the big investments in AI infrastructure over the last 12 months make sense. The scaling law machine had stopped. It was public knowledge.

 

Unless – 1) These investors don’t read the news. No, that doesn’t sound right. Or 2) These investors do read the news, but want to pass on the bag before their investments lose value, so they have to keep making promises for their organisations to make bigger and bigger investments while secretly cashing out. Or 3) They know it’s now all hot air, but they see the hype, and the zero-interest-rate-policy-period has taught them the path to riches is riding hype, so they’re going to ride this up like crypto or tulips or South Sea until they successfully time the market and get out of there. (If you’re that type of investor, you’ll be interested in what I have to say next.)


Now I finally get to that first piece of information I promised you:

 

How the public is reacting and responding to Sam Altman’s announcement of erotica in ChatGPT, and what this means about the future of “AI” (chatbots)

 

Answer:

Not good.

 

This has galvanised parents: No more chatbots for kids. This has galvanised schools: No chatbots in schools.

 

One person told me: “What I’m observing as a reaction to the ‘erotica’ story is the same as when you eat too many sweets and start to feel sick. And now you don’t want to eat any more sweets.”

 

Sometimes, the fake is just too much. The market starts to check out.

 

But there’s something deeper going on here.

 

Before – as in, these last 12 months – chatbot companies were still able to mislead some people into thinking that some business value might be found in their chatbots, or maybe, just maybe, they could make AGI.

 

But now, this week, the world has seen Sam Altman throw up his hands and deliver a message that sounds like: “Sorry everyone, my bad, we can’t make AGI, and we can’t do much of anything for business, so in a desperate attempt to maintain any revenue of substance, we’re becoming a porn company.”

 

Ouch.


This is going to hurt (a lot), but I, for one, welcome the beginning of the end

 

Investors took a gamble on AGI and lost. They got sexbots instead. Oops.

 

They put in a lot of money.

 

This could be really, really, bad. (See: Ed Zitron’s blog posts for how bad it could be.)

 

But: All those investors (including banks, sovereign wealth funds, and pension funds) currently looking at properties in the desert to build massive data centres – if they wake up now, they can maybe stop before they put more capital at risk. And that’s a better global economic situation than if those funds, from those economically critical players, are invested in a massive red herring of data centres that will not be used, not even to train or generate porn. (See: Scaling laws, above, and Scaling inference, above)

 

But for me – we were founded to stop the harms created by AI systems that existed pre-chatbots (e.g. social media algorithms) and then we moved on to trying to stop the harms from chatbots (e.g. sexism, bioterrorism, ChatGPT psychosis). So I’m over the moon that users seem to be on the verge of falling out of love with these systems.

 

And maybe, just maybe, the type of “AI” (sorry, it isn’t and never was intelligent) that can actually do things like discover how to fold proteins will grow interesting again. I think it might not, for a while – we’ll have another “AI winter” (yes, that’s a fascinating search term, meaning periods where people aren’t interested in investing in “AI” because they got burnt on a previous bubble) but I’m fine with that.

 

Early in 2024, I went on a podcast saying, “Someday, we’re going to say about chatbots and GenAI, ‘That’s not AI! We don’t even use that anymore’.” Thanks to Sam Altman’s announcement this week, we’re getting closer to that. Thanks, Sam Altman.

 

What does this mean for you?

 

Send me your thoughts!

 

And also your questions.



Wishing you the best possible trajectory through these interesting times,

 

Rebecca Gorman

Founder, Aligned AI

©2025 Aligned AI