MMLU

Massive Multitask Language Understanding (MMLU) is a widely used multiple-choice benchmark for evaluating the knowledge and reasoning capabilities of large language models. It has inspired several variants and spin-offs, such as MMLU-Pro, MMMLU, and MMLU-Redux.

Type: RL Env
Publisher: General Reasoning
Tags: Question Answering
Runtime: ORS
License: unknown
Size: 115,700 tasks
Published: Jan 2026
Canonical: openreward.ai/GeneralReasoning/MMLU

Attribution

README: openreward.ai/GeneralReasoning/MMLU
Scores: OpenReward

Public scores on this env

1 vf-eval report across 1 model

1. Qwen 3 Coder Next (Alibaba): 87.73