HLE Verified

Fresh

HLE-Verified is a systematically audited and reliability-enhanced version of the Humanity’s Last Exam (HLE) benchmark.

Type: RL Env
Runtime: ORS
License: unknown
Size: 2500 tasks
Published: Feb 2026

Public scores on this env

1 vf-eval report across 1 model