Dawn Drain

Anthropic researcher; previously Microsoft Research; works on code models and language model training.

Role: researcher
Currently at: Anthropic
Twitter: Unknown
GitHub: github.com/dawn-drain
Scholar: scholar.google.com/citations
Papers: 7

Attribution

Affiliations & profile: scholar.google.com/citations

Authored papers

Training a Helpful and Harmless Assistant with Reinforcement Learning from Human Feedback

preprint 2022

Red Teaming Language Models to Reduce Harms: Methods, Scaling Behaviors, and Lessons Learned

arXiv 2022

Discovering Language Model Behaviors with Model-Written Evaluations

arXiv 2022

Toy Models of Superposition

arXiv 2022

Constitutional AI: Harmlessness from AI Feedback

arXiv 2022

CodeXGLUE: A Machine Learning Benchmark Dataset for Code Understanding and Generation

arXiv 2021

Unit Test Case Generation with Transformers and Focal Context

arXiv 2020

Affiliations

Currently at

Anthropic

researcher · frontier lab

Previously

Microsoft

frontier lab

Frequent co-authors

from 7 papers

Catherine Olsson

researcher

5 shared papers

Dario Amodei

CEO

5 shared papers

Jared Kaplan

co-founder / Chief Science Officer

5 shared papers

Nelson Elhage

researcher

5 shared papers

Sam McCandlish

co-founder

5 shared papers

Shauna Kravec

researcher

5 shared papers

Tom Henighan

researcher

5 shared papers

Tristan Hume

engineer

5 shared papers

Zac Hatfield-Dodds

researcher

5 shared papers

Amanda Askell

researcher

4 shared papers