SUPER Expert

SUPER is the first benchmark designed to evaluate the capability of LLMs in setting up and executing tasks from research repositories. SUPER aims to capture the realistic challenges faced by researchers working with Machine Learning (ML) and Natural Language Processing (NLP) r…

Type: RL Env
Tags: AI Research Tasks, Machine Learning Engineering, Machine Learning Engineering Tasks
Runtime: ORS
License: unknown
Size: 45 tasks
Published: Mar 2026
Canonical: openreward.ai/jbragg/SUPER-Expert

Attribution

README: openreward.ai/jbragg/SUPER-Expert

Contributors

Unknown