AIME as an LLM Evaluation Benchmark

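The convention of using the American Invitational Mathematics Examination (AIME) to evaluate frontier reasoning models. Each AIME paper has 15 hard math problems, and every answer is an integer from 0 to 999, so responses can be scored by exact match.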

Year: 2024
Venue: blog
