Julian Michael
AI safety/alignment researcher at Meta; previously head of SEAL at Scale AI and postdoc at NYU on debate-based scalable oversight.
- Role: researcher
- Currently at: Meta Platforms
- Twitter: twitter.com/_julianmichael_
- GitHub: github.com/julianmichael
- Scholar: scholar.google.com/citations
- Papers: 11
Authored papers (11)
Inverse Scaling in Test-Time Compute
arXiv 2025
Alignment faking in large language models
arXiv 2024
Rapid Response: Mitigating LLM Jailbreaks with a Few Examples
arXiv 2024
Training Language Models to Win Debates with Self-Play Improves Judge Accuracy
arXiv 2024
Bias-Augmented Consistency Training Reduces Biased Reasoning in Chain-of-Thought
arXiv 2024
GPQA: A Graduate-Level Google-Proof Q&A Benchmark
COLM
We're Afraid Language Models Aren't Modeling Ambiguity
arXiv 2023
Language Models Don't Always Say What They Think: Unfaithful Explanations in Chain-of-Thought Prompting
Debate Helps Supervise Unreliable Experts
arXiv 2023
Asking It All: Generating Contextualized Questions for any Semantic Role
EMNLP 2021
Large-Scale QA-SRL Parsing
Eval contributions (1)
Affiliations
Frequent co-authors
10 from 11 papers
Ethan Perez
Samuel R. Bowman
David Rein (researcher)
Henry Sleight
Jackson Petty (grad student)
Julien Dirani
Linda Petrini
Miles Turpin
Akbir Khan
Alane Suhr (professor)