
Chem Qa

Frontier

Chem-QA: tolerance-scored numeric chemistry problems (SingleTurnEnv)
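
The listing does not say how answers are parsed or which tolerance is applied, so the following is only a minimal sketch of what tolerance scoring of a numeric answer could look like. The function names, the "take the last number in the completion" rule, and the 1% relative tolerance are all assumptions made for illustration, not the environment's actual rubric.

```python
import math
import re
from typing import Optional

# Assumed pattern for decimal or scientific-notation numbers in free text.
_NUMBER = re.compile(r"[-+]?(?:\d+\.?\d*|\.\d+)(?:[eE][-+]?\d+)?")


def extract_last_number(text: str) -> Optional[float]:
    """Hypothetical parser: return the last number found in the text, if any."""
    matches = _NUMBER.findall(text)
    return float(matches[-1]) if matches else None


def tolerance_score(completion: str, expected: float,
                    rel_tol: float = 0.01, abs_tol: float = 1e-9) -> float:
    """Return 1.0 if the parsed answer is within tolerance of expected, else 0.0.

    rel_tol=0.01 (1%) is an illustrative default, not a documented value.
    """
    value = extract_last_number(completion)
    if value is None:
        return 0.0
    close = math.isclose(value, expected, rel_tol=rel_tol, abs_tol=abs_tol)
    return 1.0 if close else 0.0


if __name__ == "__main__":
    print(tolerance_score("The molar mass of water is 18.02 g/mol", 18.015))
    print(tolerance_score("Roughly 20 g/mol", 18.015))
```

A binary reward like this is the simplest choice; a real environment might instead use partial credit, unit checking, or a different parsing rule.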

Domain: rl-env
License: unknown
Published: Oct 2025


Attribution

Leaderboard scores: prime-hub

Top score: 1.01 by GPT-4.1 Mini; 4 models reporting (4 frontier)

Score history

[Chart: score history. Y-axis: score, 0 to 1.5. X-axis: Jul 24 to Mar 25. Series: GPT-4o-mini, GPT-4o, GPT-4.1 Mini.]

Top models

[Bar chart: Chem Qa scores for 4 models. Highest value: GPT-4.1 Mini at 1.]

Related tools

Implementations, trainers, datasets and scaffolds linked to this eval.

FAQ

What is Chem Qa?
Chem-QA is an RL environment of tolerance-scored numeric chemistry problems (SingleTurnEnv).
What is the current top score on Chem Qa?
The top reported score is 1.01 by GPT-4.1 Mini, across 4 models reporting (4 from frontier labs).
How can a model improve its Chem Qa score?
Tools linked to Chem Qa on Sophon include CHEM QA RL Env (Community). Linked tools are RL environments, datasets, and scaffolds that target this eval.
What license is Chem Qa under?
Chem Qa's license is listed as unknown.