Attribution
Inverse Scaling in Test-Time Compute
arXiv 2025
Alignment faking in large language models
arXiv 2024
from 2 papers
Ethan Perez
Julian Michael
researcher
Akbir Khan
Alexander Hägele
Andy Arditi
Aryo Pradipta Gema
Beatrice Alex
Benjamin Wright
Buck Shlegeris
Carson Denison