Attribution
GameDevBench: Evaluating Agentic Capabilities Through Game Development
arXiv 2026
The RealHumanEval: Evaluating Large Language Models' Abilities to Support Programmers
arXiv 2024
from 2 papers
Ameet Talwalkar
Alexander Wang
Arnav Yayavaram
Chris Donahue
David Sontag
Dennis Wei
Hussein Mozannar
Manish Nagireddy
Mohammed Alsobay
Prasanna Sattigeri