David Duvenaud
- Papers: 7
Authored papers (7)
Sycophancy to Subterfuge: Investigating Reward-Tampering in Large Language Models
arXiv 2024
Sleeper Agents: Training Deceptive LLMs that Persist Through Safety Training
arXiv 2024
Alignment faking in large language models
arXiv 2024
Residual Flows for Invertible Generative Modeling
Explaining Image Classifiers by Counterfactual Generation
Isolating Sources of Disentanglement in Variational Autoencoders
Convolutional Networks on Graphs for Learning Molecular Fingerprints
Frequent co-authors
10 from 7 papers
Buck Shlegeris
Carson Denison
Ethan Perez
Evan Hubinger
Jared Kaplan (co-founder / Chief Science Officer)
Monte MacDiarmid
Samuel R. Bowman
Fazl Barez
Nicholas Schiefer
Ricky T. Q. Chen