Attribution
Sleeper Agents: Training Deceptive LLMs that Persist Through Safety Training
arXiv 2024
Learning to Compress Prompts with Gist Tokens
NeurIPS 2023 11
STaR: Bootstrapping Reasoning With Reasoning
arXiv 2022
from 3 papers
Adam Jermyn
Amanda Askell
researcher
Ansh Radhakrishnan
Buck Shlegeris
Carson Denison
Cem Anil
Daniel M. Ziegler
David Duvenaud
Deep Ganguli
Eric Zelikman