UncertaintyBench

Saturated

Implementation of UncertaintyBench

Domain: rl-env
License: unknown
Published: Oct 2025

Attribution

Leaderboard scores: prime-hub

Top score 100.0% by GPT-4.1 Mini - 2 models reporting (1 frontier)

Top models

[Bar chart: UncertaintyBench scores for 2 models. Highest value: GPT-4.1 Mini at 100.]

Related tools (1)

Implementations, trainers, datasets and scaffolds linked to this eval.

FAQ

What is UncertaintyBench?
Implementation of UncertaintyBench
What is the current top score on UncertaintyBench?
The top reported score is 100.0% by GPT-4.1 Mini, across 2 models reporting (1 from frontier labs).
How can a model improve its UncertaintyBench score?
Tools linked to UncertaintyBench on Sophon include Uncertaintybench RL Env (Prime Intellect). Linked tools cover RL environments, datasets, and scaffolds that target this eval.
What license is UncertaintyBench under?
The license for UncertaintyBench is listed as unknown.