AIME2025
Fresh
Problems from the American Invitational Mathematics Examination (AIME) 2025-I & II.
- Type: RL Env
- Publisher: General Reasoning
- Runtime: ORS
- License: unknown
- Size: 30 tasks
- Published: Jan 2026
Public scores on this env
1415 vf-eval reports across 14 models
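| Rank | Model | Publisher | Score |
|------|-------|-----------|-------|
| 1 | Grok 4 Heavy (with python) | xAI | 100 |
| 2 | Claude Sonnet 4.5 | Anthropic | 100 |
| 3 | GPT-5.2 | OpenAI | 100 |
| 4 | Step 3.5 Flash (parallel thinking) | | 99.9 |
| 5 | Claude Opus 4.6 | Anthropic | 99.79 |
| 6 | Grok 4 | xAI | 98.8 |
| 7 | Step 3.5 Flash | | 97.3 |
| 8 | Kimi K2.5 | Moonshot AI | 96.1 |
| 9 | Gemini 3 Pro | Google (Alphabet Inc.) | 95 |
| 10 | o4 Mini | OpenAI | 92.7 |
| 11 | o3 | OpenAI | 88.9 |
| 12 | o3 Mini | OpenAI | 86.5 |
| 13 | MiniMax M2.5 | Minimax | 86.3 |
| 14 | Qwen 3 Coder Next | Alibaba | 83.07 |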