SUPER Expert
SUPER is the first benchmark designed to evaluate the capability of LLMs in setting up and executing tasks from research repositories. SUPER aims to capture the realistic challenges faced by researchers working with Machine Learning (ML) and Natural Language Processing (NLP) r…
- Type: RL Env
- Runtime: ORS
- License: unknown
- Size: 45 tasks
- Published: Mar 2026
- Canonical: openreward.ai/jbragg/SUPER-Expert