AIME 2025: Problems from the American Invitational Mathematics Examination

Saturated

A benchmark for evaluating AI's ability to solve challenging mathematics problems from the 2025 AIME - a prestigious high school mathematics competition.

Domain
Mathematics
License
MIT
Published
Oct 2025
Notable for
Benchmark for evaluating mathematical problem solving.

Top score 98.7% by GPT-5 Codex - 207 models reporting (45 frontier)

Score history

Chart: score (0% to 100%) over time (Feb 24 to Oct 25), 207 models. Labeled models: Phi-4 Mini Instruct, GPT-4o-mini, Phi 4, Grok 3 mini, o4 Mini, Grok 4, GPT-5 Codex.

Top models

Bar chart with 21 bars (21 of 207 models shown). Highest value: GPT-5 Codex at 98.7%.

Related tools

5 tools: implementations, trainers, datasets and scaffolds linked to this eval.

FAQ

What is AIME 2025: Problems from the American Invitational Mathematics Examination?
A benchmark for evaluating AI's ability to solve challenging mathematics problems from the 2025 AIME - a prestigious high school mathematics competition.
What is the current top score on AIME 2025: Problems from the American Invitational Mathematics Examination?
The top reported score is 98.7% by GPT-5 Codex, across 207 models reporting (45 from frontier labs).
How can a model improve its AIME 2025: Problems from the American Invitational Mathematics Examination score?
Tools linked to AIME 2025: Problems from the American Invitational Mathematics Examination on Sophon include AIME 2025 RL Env (Dev Team), AIME 2025 RL Env (Prime Intellect), AIME 2025 RL Env (Community), and VF Openbench RL Env (Community). These RL environments, datasets, and scaffolds target this eval.
What license is AIME 2025: Problems from the American Invitational Mathematics Examination under?
AIME 2025: Problems from the American Invitational Mathematics Examination is available under the MIT license.