Attribution
Precise Debugging Benchmark: Is Your Model Debugging or Regenerating?
arXiv 2026
Identifying the Risks of LM Agents with an LM-Emulated Sandbox
arXiv 2023
from 2 papers
Andrew Wang
Chris J. Maddison
Jimmy Ba
Miaosen Chai
Robin Jia
Shangshang Wang
Silviu Pitis
Song Bian
Tatsunori Hashimoto
professor
Wang Bill Zhu