MathArena
Frontier
Live leaderboard of LLMs on recent math-olympiad and research-style problems, refreshed monthly to minimise pretraining contamination.
- Domain: math
- Published: May 2026
- Updates: Monthly
- Notable for: Continuously refreshed math benchmark that pulls fresh problems each month, so frontier models are unlikely to have seen them during pretraining, and reports per-problem accuracy with confidence intervals.
- Canonical: matharena.ai
- Official leaderboard: matharena.ai
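The "per-problem accuracy with confidence intervals" idea can be illustrated with a minimal sketch. The source does not say how MathArena computes its intervals, so the Wilson score interval, the 95% level (z = 1.96), the `runs` data, and the function names below are all assumptions chosen for illustration, not the benchmark's actual method.

```python
import math


def wilson_interval(successes: int, trials: int, z: float = 1.96) -> tuple:
    """Wilson score interval for a binomial proportion (z=1.96 is roughly 95%)."""
    if trials <= 0:
        raise ValueError("trials must be positive")
    p_hat = successes / trials
    denom = 1 + z**2 / trials
    centre = (p_hat + z**2 / (2 * trials)) / denom
    half_width = (
        z * math.sqrt(p_hat * (1 - p_hat) / trials + z**2 / (4 * trials**2)) / denom
    )
    # Clamp to [0, 1] to absorb floating-point overshoot at the boundaries.
    return max(0.0, centre - half_width), min(1.0, centre + half_width)


def per_problem_report(runs: dict) -> dict:
    """Map each problem id to its accuracy and interval over repeated attempts."""
    report = {}
    for problem_id, outcomes in runs.items():
        correct, attempts = sum(outcomes), len(outcomes)
        report[problem_id] = {
            "accuracy": correct / attempts,
            "ci": wilson_interval(correct, attempts),
        }
    return report


# Hypothetical data: problem id -> pass/fail outcome of each repeated attempt.
runs = {
    "problem_1": [True, True, False, True],
    "problem_2": [False, False, False, False],
}
report = per_problem_report(runs)
for problem_id, row in report.items():
    low, high = row["ci"]
    print(f"{problem_id}: accuracy={row['accuracy']:.2f}, CI=[{low:.2f}, {high:.2f}]")
```

With only a handful of attempts per problem the intervals are wide, which is why a leaderboard that reports them gives a more honest picture than a bare percentage.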
Top score: 92.8% by GPT-5.5, with 23 models reporting (9 from frontier labs)
Score history (22)
Top models (23)
Where it's ranked (1)
Related tools (1)
Implementations, trainers, datasets and scaffolds linked to this eval.
FAQ
- What is MathArena?
- MathArena is a live leaderboard of LLMs on recent math-olympiad and research-style problems, refreshed monthly to minimise pretraining contamination.
- What is the current top score on MathArena?
- The top reported score is 92.8% by GPT-5.5, across 23 models reporting (9 from frontier labs).
- How can a model improve its MathArena score?
- Tools linked to MathArena on Sophon (RL environments, datasets, and scaffolds that target this eval) include APEX Shortlist RL Env (Prime Intellect).
