SUPER Expert

Fresh

SUPER is the first benchmark designed to evaluate the capability of LLMs in setting up and executing tasks from research repositories. SUPER aims to capture the realistic challenges faced by researchers working with Machine Learning (ML) and Natural Language Processing (NLP) r…

Type: RL Env
Runtime: ORS
License: unknown
Size: 45 tasks
Published: Mar 2026

Contributors: 1