Attribution
Terminal Wrench: A Dataset of 331 Reward-Hackable Environments and 3,632 Exploit Trajectories
arXiv 2026
Jailbreaking in the Haystack
arXiv 2025
from 2 papers
Aditi Raghunathan
Ziqian Zhong
Alexander Robey
Chen Henry Wu
Ivan Bercovich
Ivgeni Segal
Kexun Zhang
Rishi Rajesh Shah