MATH: Measuring Mathematical Problem Solving
Active
Dataset of 12,500 challenging competition mathematics problems. Demonstrates few-shot prompting and custom scorers. NOTE: The dataset has been taken down due to a DMCA notice from The Art of Problem Solving.
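As a rough illustration of what few-shot prompting and a custom scorer can look like for this kind of eval, here is a minimal sketch. It is not the eval's actual implementation. The field names `problem` and `solution`, the helper names, and the convention that the final answer appears inside `\boxed{...}` are assumptions made for the example, and the string comparison is a simplification (a real scorer would more likely check mathematical equivalence).

```python
def build_few_shot_prompt(examples, problem):
    """Join worked examples and the target problem into one prompt.

    `examples` is a list of dicts with hypothetical keys
    "problem" and "solution".
    """
    parts = [
        f"Problem: {ex['problem']}\nSolution: {ex['solution']}"
        for ex in examples
    ]
    # The target problem ends with an open "Solution:" for the model to fill.
    parts.append(f"Problem: {problem}\nSolution:")
    return "\n\n".join(parts)


def extract_boxed(text):
    """Return the contents of the last \\boxed{...} in `text`, or None."""
    marker = "\\boxed{"
    start = text.rfind(marker)
    if start == -1:
        return None
    depth = 1
    chars = []
    # Walk forward, tracking brace depth so nested braces are kept intact.
    for ch in text[start + len(marker):]:
        if ch == "{":
            depth += 1
        elif ch == "}":
            depth -= 1
            if depth == 0:
                return "".join(chars)
        chars.append(ch)
    return None  # braces never balanced


def normalize(answer):
    """Simplified normalization: drop spaces and trailing periods."""
    return answer.replace(" ", "").rstrip(".")


def score(completion, target):
    """Exact match on the normalized boxed answer (an assumed criterion)."""
    predicted = extract_boxed(completion)
    return predicted is not None and normalize(predicted) == normalize(target)
```

A typical use would be to build the prompt from a handful of solved problems, send it to a model, and pass the completion and the reference answer to `score`.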
- Publisher: University of California, Berkeley
- Domain: Mathematics
- License: MIT
- Published: May 2026
- Notable for: Benchmark for evaluating mathematical problem solving.
Related tools (4)
Implementations, trainers, datasets and scaffolds linked to this eval.
Papers (1)
FAQ
- What is MATH: Measuring Mathematical Problem Solving?
  - Dataset of 12,500 challenging competition mathematics problems. Demonstrates few-shot prompting and custom scorers. NOTE: The dataset has been taken down due to a DMCA notice from The Art of Problem Solving.
- How can a model improve its MATH: Measuring Mathematical Problem Solving score?
  - Tools linked to MATH: Measuring Mathematical Problem Solving on Sophon include Hendrycks MATH RL Env (Community), Hendrycks MATH RL Env (Prime Intellect), Hendrycksmath RL Env (Community), and MATH 500 RL Env (Prime Intellect). These are RL environments, datasets, and scaffolds that target this eval.
- What license is MATH: Measuring Mathematical Problem Solving under?
  - MATH: Measuring Mathematical Problem Solving is available under the MIT license.