Owain Evans
Founder of Truthful AI; AI alignment researcher known for TruthfulQA and for work on situational awareness, emergent misalignment, and subliminal learning.
- Role: founder
- Currently at: Truthful AI
- Twitter: twitter.com/OwainEvans_UK
- GitHub: github.com/owainevans
- Scholar: scholar.google.com/citations
- Papers: 12
Authored papers (12)
Emergent Misalignment: Narrow finetuning can produce broadly misaligned LLMs
arXiv 2025
Persona Vectors: Monitoring and Controlling Character Traits in Language Models
arXiv 2025
Activation Oracles: Training and Evaluating LLMs as General-Purpose Activation Explainers
arXiv 2025
Weird Generalization and Inductive Backdoors: New Ways to Corrupt LLMs
arXiv 2025
Tell me about yourself: LLMs are aware of their learned behaviors
arXiv 2025
Looking Inward: Language Models Can Learn About Themselves by Introspection
arXiv 2024
Connecting the Dots: LLMs can Infer and Verbalize Latent Structure from Disparate Training Data
arXiv 2024
The Reversal Curse: LLMs trained on "A is B" fail to learn "B is A"
arXiv 2023
Beyond the Imitation Game: Quantifying and extrapolating the capabilities of language models
TMLR
Forecasting Future World Events with Neural Networks
arXiv 2022
Teaching Models to Express Their Uncertainty in Words
arXiv 2022
TruthfulQA: Measuring How Models Mimic Human Falsehoods
ACL
Eval contributions (1)
Affiliations
Frequent co-authors (10, from 12 papers)
James Chua
Jan Betley
Anna Sztyber-Betley
Jacob Hilton (researcher)
Stephanie Lin (researcher)
Andy Arditi
Andy Zou (founder)
Dan Hendrycks (director)
Henry Sleight
Mantas Mazeika (researcher)