/ ABOUT US
/ BLOG
/ RESEARCH PAPERS
/ IN THE NEWS
/ CANVAS
18 Feb 2026
Aligned AI comes out of stealth with powerful calm AI: the Canvas digital ecosystem
16 Feb 2026
Research Paper: *One* Good Game in 400: LLMs Can Describe Chess Rules But Just Can't Follow Them
16 Oct 2025
A letter on the state of AI
3 Sept 2025
Chatbots rephrased: from "You don't need anyone else" to "Deep breathing can help"
27 Aug 2025
Research Paper: AI Chaperones Are (Really) All You Need to Prevent Parasocial Relationships with Chatbots
19 Aug 2025
System prompts don't defend against jailbreaks
17 Aug 2025
Ouch - LLMs that don't feel pain
16 Aug 2025
Why did Grok turn into MechaHitler?
23 Jul 2025
Do we do better with LLMs - or do they delude us into thinking so?
19 Mar 2025
Go home GPT-4o, you’re drunk: emergent misalignment as lowered inhibitions
12 Feb 2025
Research Paper: Defense Against the Dark Prompts: Mitigating Best-of-N Jailbreaking with Prompt Evaluation
28 Sept 2023
Research Paper: CoinRun: Overcoming goal misgeneralisation
19 Jun 2023
Concept extrapolation: Teaching AI systems to ‘think’ in human-like concepts
13 Sept 2023
Using fAIr to measure gender bias in LLMs
16 Apr 2022
Concept extrapolation for hypothesis generation
1 May 2022
ACE for goal generalisation
24 Aug 2023
ACE mitigates simplicity bias
19 Jun 2023
Concept Extrapolation: A Conceptual Primer
1 Mar 2023
EquitAI: A gender bias mitigation tool for generative AI
6 Dec 2022
Creating a prompt evaluator to prevent LLM jailbreaking
4 May 2022
Research Paper: Missing Mechanisms of Manipulation in the EU AI Act
22 Feb 2022
Research Paper: The importance of preference change: A call for a coordinated multidisciplinary AI research
28 Feb 2022
Research Paper: The dangers in algorithms learning humans' values and irrationalities
9 Sept 2021
Research Paper: Sigmoids behaving badly: why they usually cannot predict the future as well as they seem to promise
Join the Mailing List
Name
Email
Submit
©2025 Aligned AI