GDPval: Evaluating AI Model Performance on Real-World Economically Valuable Tasks
OpenAI's eval of frontier models against expert deliverables in 44 occupations spanning the top GDP-contributing sectors of the US economy, judged blind by industry experts.
- Publisher: OpenAI
- Year: 2025
- Venue: preprint
- Authors: 13
- Hosting: External source; license unknown
Introduces 1 artifact - 1 eval
TL;DR (Semantic Scholar)
Frontier model performance on GDPval is improving roughly linearly over time, and the current best frontier models are approaching industry experts in deliverable quality.
Artifacts
1 eval