ProofBench

Frontier

Automated theorem proving benchmark

Publisher: Vals AI
Domain: Mathematics
Published: Jun 2026
Official leaderboard: vals.ai/benchmarks/proof_bench

Attribution

Leaderboard scores: Vals AI

Top score 69.0% by Claude Opus 4.8 - 30 models reporting (19 frontier)

Score history

[Chart: score history. Y-axis: score, 0% to 100%. X-axis: Aug 25 to Apr 26. Labeled models: GPT-5, Claude Sonnet 4.5, Gemini 3 Pro Preview, Claude Opus 4.6, GPT-5.4, Claude Opus 4.8.]

Top models

[Bar chart: ProofBench scores for 21 models. Highest value: Claude Opus 4.8 at 69.]

FAQ

What is ProofBench?
ProofBench is an automated theorem proving benchmark in mathematics, published by Vals AI.
What is the current top score on ProofBench?
The top reported score is 69.0% by Claude Opus 4.8, across 30 models reporting (19 from frontier labs).