Attribution
AIRS-Bench: a Suite of Tasks for Frontier AI Research Science Agents
arXiv 2026
MLGym: A New Framework and Benchmark for Advancing AI Research Agents
arXiv 2025
Souper-Model: How Simple Arithmetic Unlocks State-of-the-Art LLM Performance
from 3 papers
Amar Budhiraja
Despoina Magka
Roberta Raileanu
Tatiana Shavrina
Yoram Bachrach
Alexis Audran-Reiss
Anton Protopopov
Bhavul Gauri
Jakob Foerster
Abhinav Moudgil