
Hendrycksmath

hendrycksmath evaluation environment

Domain: rl-env
License: unknown
Published: Sep 2025


Attribution

Leaderboard scores
prime-hub

Top score: 63.3% by GPT-4.1 Mini (1 model reporting, 1 frontier)

Top models

[Bar chart: Hendrycksmath, 1 model. Highest value: GPT-4.1 Mini at 63.3.]

Related tools

1

Implementations, trainers, datasets and scaffolds linked to this eval.

FAQ

What is Hendrycksmath?
Hendrycksmath is an evaluation environment in the rl-env domain.
What is the current top score on Hendrycksmath?
The top reported score is 63.3%, achieved by GPT-4.1 Mini. It is the only model reporting a score so far, and it comes from a frontier lab.
How can a model improve its Hendrycksmath score?
Tools linked to Hendrycksmath on Sophon include Hendrycksmath RL Env (Community). Linked tools cover RL environments, datasets, and scaffolds that target this eval.
What license is Hendrycksmath under?
The license for Hendrycksmath is listed as unknown.