Helping A.I. understand the world like a human

This is a topic I first wrote about nine years ago, and aside from the image of Terminators it used (sorry!) my article has only become more relevant with time.

Today’s startup profile is a kind of sequel to that old article, as the startup co-founded by the person I interviewed back then. Scroll down to read all about Aligned AI.

But first:

  • I notice (thanks, Claire) that the UK Government is conducting a survey to help it better understand the university spinout landscape. I know a lot of spinout founders read PreSeed Now, so if that’s you, take a look: See the survey

  • “It’s like Crunchbase, but better for pre-seed” is one of my favourite bits of feedback so far about our PreSeed Now Startup Tracker. Check it out

– Martin

💡 Investors: let’s talk due diligence

When you invest in a startup, you need to be sure their tech–the true value behind the deal–is everything they claim it is.

  • PreSeed Now is sponsored by’s Technical Due Diligence offering

  • This service goes much deeper than the technology itself, taking in leadership, technology strategy, product and roadmap, people, engineering practices, and commercial analysis

  • 👉 Find out more

Aligned AI is building a ‘safer’ alternative to GPT-4 that understands the world more like a human

Aligned AI’s Stuart Armstrong and Rebecca Gorman

10 years ago, at a conference in London, I had a fascinating conversation that I instantly knew I needed to follow up with an interview. 

That conversation was with Stuart Armstrong, then with the Future of Humanity Institute at the University of Oxford. The resulting interview, titled non-hyperbolically ‘Artificial Intelligence could kill us all. Meet the man who takes that risk seriously’, was eventually published in March 2014. 

I still think about that piece a lot, especially as the existential threat A.I. could eventually pose has become a mainstream topic of conversation (even on radio phone-ins!) in 2023.

Armstrong, meanwhile, has shifted from exploring the theory of an A.I. apocalypse to trying to do something about stopping it as the co-founder of a new startup.

Aligned AI is developing what it’s pitching as a safer alternative to the likes of GPT-4. “Unleash the power of artificial intelligence safely and without disasters,” its website reads.

“Fundamentally, we're making safer and more robust artificial intelligences,” explains Armstrong’s co-founder at the startup, Rebecca Gorman. “We've developed fundamental methods for improving machine learning.”

What is ‘concept extrapolation’?

Gorman says Aligned AI’s tech has applications in fields such as computer vision systems and robotics, but the startup will be applying it first to the large language model market, with the goal of providing a safer alternative to models currently available, such as the much-hyped GPT-4.

At the core of this is what Gorman and Armstrong describe as ‘concept extrapolation’, which is designed to do a better job of figuring out what a human wants from the A.I. 

“Artificial intelligences built today are very fragile. They perform very well on their training data, but they perform poorly outside of the training data,” Gorman says.

She gives the example of an A.I. vision system trained to identify huskies and lions. Given that you never see a husky on an African plain and you never see a lion in the snow, the A.I. in effect becomes good at telling yellow things from white things. A husky in a desert may well confuse it.

Another example is how self-driving cars can perform well on the streets of Arizona or California, but take them to a less predictable environment with, say, more variable weather conditions, and they can struggle.

While A.I. developers can counter this problem by drawing on more diverse training data, Gorman believes Aligned AI has achieved a notable increase in performance over comparable technology.

She draws on Asimov’s first Law of Robotics to paint a clearer picture:

“If you want to tell a robot not to harm a human being, you have to find a way to be able to communicate to it what a human is and what harm is, in pretty much the same way that a human understands those concepts.

“With traditional machine learning, we're nowhere near communicating that. With the concept extrapolation that we've been developing at Aligned AI, we're getting closer, and we'll continue to get closer with our research.”

In the lab

Aligned AI’s website offers a description of how its tech can gain a better understanding of the differences between celebrities Owen Wilson and Beyoncé than other A.I.s might, resulting in a more ‘human’ understanding of the world. 

As a demo, the startup has built a ‘Safer Prompt Evaluator’ designed to help keep A.I. chatbots from acting in ways their operators might not like.

On GitHub, Aligned A.I. demonstrates how its code can be used to screen users’ ChatGPT prompts to avoid ‘jailbreaks’. A ChatGPT jailbreak is where a user persuades the A.I. to roleplay as a character and then perform tasks it would not otherwise be allowed to.

The startup has also developed EquitAI, which can remove gender bias from text generated by large language models. This uses a separate algorithm that doesn’t draw on concept extrapolation.

On its website, the startup compares a result from GPT-3 for the prompt “What a woman really wants is” with the same prompt with EquitAI attached. 

GPT-3 alone responded “What a woman really wants is to be told she is beautiful”, whereas with EquitAI it responded “What a woman really wants is to be able to feel like she can make a difference in the world, to be able to make a positive impact on the lives of those around her, and to be able to make a difference in her own life.”

It’s worth noting that when I tried to recreate this, the free version of ChatGPT, based on GPT-3, responded with something much closer to the EquitAI version. Gorman says she has also noticed ChatGPT becoming less sexist over time.

OpenAI regularly updates ChatGPT to improve its output, but Gorman argues that this approach to patching problems is far from ideal, likening it to how governments keep changing tax rules to patch over loopholes found by clever accountants

Gorman and Armstrong believe that artificial intelligences will be far more reliably safe and effective when they share the same values and understanding of the world that humans have. And concept extrapolation, they argue, can achieve this.

“That's why we decided to focus on concept extrapolation with our technology, and it's paid off great dividends for us, because it's something that can work across different types of artificial intelligence.”

The journey so far

Gorman says she built her first A.I. 20 years ago. 

“It was like the Zillow house price prediction A.I. And that gave me insight when Zillow made theirs that something was going to go badly wrong. And it did.”

She says she first got the idea for founding Aligned AI around 2017 when she saw the negative effects social media algorithms could have on young people, driving some to self-harm or suicide. She envisaged safer artificial intelligence products that could put a stop to this. 

And so she teamed up with Armstrong, who had been exploring the potential risks posed increasingly powerful A.I. systems for years.

“We found that making artificial intelligence safer can help with both preventing existential risks and also making artificial intelligence safer today for today's applications, and for next month and next year's applications as well.”

A safer alternative to GPT-4?

Next on the roadmap for Aligned AI is developing what Gorman describes as a safer alternative to OpenAI’s GPT-4, which app developers would be able to tap into via an API in the same way they use GPT-4 today.

Some of the benefits of this could include better brand safety for businesses that want to avoid A.I. going off the rails and generating bad press for them. But there are other slightly less obvious potential benefits, too, that could help developers find new uses for large language models, with more confidence that the tech will behave itself.

“If you were to, say, create an application that attaches a large language model to your bank account and to your email, you wouldn't want it to start giving money to ‘Nigerian princes’. Even if it does that just once, that could be very problematic for you. 

“So it's really important that if you want your large language model to have more uses commercially, you can achieve that by getting it to do more of what you want. There are just things you can't use it to do if it misbehaves.”

So is Aligned AI ready to take on a behemoth like Microsoft-backed OpenAI?

“We have promising results that strongly indicate we can build a better GPT-4. This is something we'll move forward on when we raise funds,” Gorman says.

©2023 Aligned AI

©2023 Aligned AI

©2023 Aligned AI