FrontierMath: A Benchmark for Evaluating Advanced Mathematical Reasoning in AI
Epoch AI benchmark of hundreds of original research-level math problems authored by professional mathematicians, with auto-verifiable answers.
- Publisher: Epoch AI
- Year: 2024
- Venue: preprint
- Authors: 10
- Hosting: External source; license unknown
Introduces 1 artifact: 1 eval