Attribution
PaperBench: Evaluating AI's Ability to Replicate AI Research
arXiv 2025
MLE-bench: Evaluating Machine Learning Agents on Machine Learning Engineering
arXiv 2024
from 2 papers
Evan Mays
Giulio Starace
James Aung
Jun Shern Chan
Leon Maksin
Oliver Jaffe
Tejal Patwardhan
researcher
Aleksander Mądry
Amelia Glaese
Benjamin Kinsella