Attribution
Model Editing with Canonical Examples
arXiv 2024
Representation Engineering: A Top-Down Approach to AI Transparency
arXiv 2023
Can LLMs Follow Simple Rules?
from 3 papers
Dan Hendrycks
director
Zifan Wang
Alex Mallen
Alexander Pan
Andy Zou
founder
Ann-Kathrin Dombrowski
Basel Alomair
Christopher D. Manning
David Karamardian
David Wagner