Dawn Drain
Anthropic researcher; previously Microsoft Research; works on code models and language model training.
- Role: researcher
- Currently at: Anthropic
- Unknown
- GitHub: github.com/dawn-drain
- Scholar: scholar.google.com/citations
- Papers: 7
Authored papers (7)
Training a Helpful and Harmless Assistant with Reinforcement Learning from Human Feedback
preprint
Red Teaming Language Models to Reduce Harms: Methods, Scaling Behaviors, and Lessons Learned
arXiv 2022
Discovering Language Model Behaviors with Model-Written Evaluations
arXiv 2022
Toy Models of Superposition
arXiv 2022
Constitutional AI: Harmlessness from AI Feedback
arXiv 2022
CodeXGLUE: A Machine Learning Benchmark Dataset for Code Understanding and Generation
arXiv 2021
Unit Test Case Generation with Transformers and Focal Context
arXiv 2020
Affiliations
Previously
Frequent co-authors
10 from 7 papers
Catherine Olsson
researcher
Dario Amodei
CEO
Jared Kaplan
co-founder / Chief Science Officer
Nelson Elhage
researcher
Sam McCandlish
founder
Shauna Kravec
researcher
Tom Henighan
researcher
Tristan Hume
engineer
Zac Hatfield-Dodds
researcher
Amanda Askell
researcher