Attribution
SWE-Bench Pro: Can AI Agents Solve Long-Horizon Software Engineering Tasks?
arXiv 2025
MoReBench: Evaluating Procedural and Pluralistic Moral Reasoning in Language Models, More than Outcomes
from 2 papers
Bing Liu
Andrew Park
Brad Kenstler
Brandon Handoko
Charles Ide
Chetan Rane
Christina Q Knight
Edwin Pan
Florence Bacus
Harry R. Lloyd