Attribution
The Assistant Axis: Situating and Stabilizing the Default Persona of Language Models (arXiv 2026)
Will AI Tell Lies to Save Sick Children? Litmus-Testing AI Values Prioritization with AIRiskDilemmas (arXiv 2025)
from 2 papers
Christina Lu
Evan Hubinger
Jack Gallagher
Jack Lindsey
Jonathan Michala
Sharan Maiya
Sydney Levine
Yejin Choi (professor)
Yu Ying Chiu
Zhilin Wang (researcher)