Mastermind

Frontier

Mastermind multi-turn game environment for Verifiers

Domain: rl-env
License: unknown
Published: Mar 2026

Attribution

Leaderboard scores: prime-hub

Top score 1.62 by gpt-oss-120b - 13 models reporting (9 frontier)

Score history

[Chart: score (axis 0 to 2) over time, Dec 24 to Aug 25, for 13 models. Labeled models: Llama 3.3 Instruct 70B, GPT-4.1, GLM 4.5 Air, gpt-oss-120b.]

Top models

[Bar chart: Mastermind scores for 13 models. Highest value: gpt-oss-120b at 1.6.]

Related tools

Implementations, trainers, datasets and scaffolds linked to this eval (1 listed).

FAQ

What is Mastermind?
Mastermind is a multi-turn game environment for Verifiers.
What is the current top score on Mastermind?
The top reported score is 1.62 by gpt-oss-120b, across 13 models reporting (9 from frontier labs).
How can a model improve its Mastermind score?
Tools linked to Mastermind on Sophon include Mastermind RL Env (Prime Community). Linked tools cover RL environments, datasets, and scaffolds that target this eval.
What license is Mastermind under?
The license for Mastermind is listed as unknown.
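
For readers unfamiliar with the game behind this environment, the sketch below shows how feedback on a single guess is usually computed under the classic Mastermind rules. It is an illustration only: this page does not document the environment's code length, symbol set, feedback format, or how the leaderboard score is derived, so the function name `mastermind_feedback` and the digit-string encoding are assumptions made for the example.

```python
from collections import Counter


def mastermind_feedback(secret: str, guess: str) -> tuple[int, int]:
    """Return (exact, partial) for one guess under classic Mastermind rules.

    exact   = symbols that are correct and in the correct position
    partial = symbols that are correct but in the wrong position

    Hypothetical helper for illustration; not the environment's actual API.
    """
    if len(secret) != len(guess):
        raise ValueError("secret and guess must have the same length")

    # Position-by-position matches.
    exact = sum(s == g for s, g in zip(secret, guess))

    # Multiset intersection counts every shared symbol, ignoring position.
    overlap = sum((Counter(secret) & Counter(guess)).values())

    # Shared symbols that are not exact matches are the partial matches.
    return exact, overlap - exact


if __name__ == "__main__":
    print(mastermind_feedback("1122", "1212"))
```

In a multi-turn setting, a model would receive feedback of this kind after each guess and use it to narrow down the remaining candidate codes.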