Humanity's Last Exam
Active
Humanity's Last Exam (HLE) is a multi-modal benchmark at the frontier of human knowledge, designed to be the final closed-ended academic benchmark of its kind with broad subject coverage. Humanity's Last Exam consists of 3,000 questions across dozens of subjects, including mathem
- Publisher: Center for AI Safety (CAIS)
- Domain: Knowledge
- License: MIT
- Published: Feb 2025
- Notable for: Benchmark for evaluating knowledge.
Related tools (3)
Implementations, trainers, datasets and scaffolds linked to this eval.
Papers (1)
FAQ
- What is Humanity's Last Exam?
- Humanity's Last Exam (HLE) is a multi-modal benchmark at the frontier of human knowledge, designed to be the final closed-ended academic benchmark of its kind with broad subject coverage. Humanity's Last Exam consists of 3,000 questions across dozens of subjects, including mathem
- How can a model improve its Humanity's Last Exam score?
- Tools linked to Humanity's Last Exam on Sophon include HLE RL Env (Prime Intellect), WEB PY RL Env (Prime Community), and WEB PY RL Env (Prime Intellect). These are RL environments, datasets, and scaffolds that target this eval.
- What license is Humanity's Last Exam under?
- Humanity's Last Exam is available under the MIT license.