Mastermind

Frontier

Mastermind multi-turn game environment for Verifiers

Domain: rl-env
License: unknown
Published: Oct 2025

Attribution

Leaderboard scores: prime-hub

Top score: 1.62 by GPT-5. 12 models reporting (8 frontier).

Score history

[Chart: score history for 12 models. Score axis from 0 to 2; time axis from Dec 24 to Aug 25. Labeled models: Llama 3.3 Instruct 70B, GPT-4.1, GLM 4.5 Air, GPT-5.]

Top models

[Bar chart: Mastermind scores for 12 models. Highest value: GPT-5 at 1.6.]

Related tools

Implementations, trainers, datasets and scaffolds linked to this eval.

FAQ

What is Mastermind?
Mastermind is a multi-turn game environment for Verifiers.
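This listing does not describe the game rules or how the environment computes its score. As a rough illustration only, the sketch below assumes the classic Mastermind rules, where each guess receives a count of exact matches (right symbol, right position) and partial matches (right symbol, wrong position). The function name `score_guess` is hypothetical and is not taken from the Verifiers environment.

```python
from collections import Counter


def score_guess(secret: str, guess: str) -> tuple[int, int]:
    """Return (exact, partial) feedback under classic Mastermind rules.

    Hypothetical helper for illustration; not the environment's own API.
    """
    if len(secret) != len(guess):
        raise ValueError("secret and guess must have the same length")
    # Exact matches: same symbol in the same position.
    exact = sum(s == g for s, g in zip(secret, guess))
    # Symbol overlap regardless of position; Counter intersection keeps
    # the minimum count of each symbol, so duplicates are not over-counted.
    overlap = sum((Counter(secret) & Counter(guess)).values())
    # Partial matches: shared symbols that are not already exact matches.
    return exact, overlap - exact


if __name__ == "__main__":
    print(score_guess("1122", "1212"))
```

A multi-turn episode would repeat this feedback step until the guess matches the secret or a turn limit is reached. How the reported leaderboard score is derived from such episodes is not stated here.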
What is the current top score on Mastermind?
The top reported score is 1.62 by GPT-5, across 12 models reporting (8 from frontier labs).
How can a model improve its Mastermind score?
Tools linked to Mastermind on Sophon include Mastermind RL Env (Community). Linked tools are RL environments, datasets, and scaffolds that target this eval.
What license is Mastermind under?
The license for Mastermind is listed as unknown.