GDPval: Evaluating AI Model Performance on Real-World Economically Valuable Tasks

OpenAI's eval of frontier models against expert deliverables in 44 occupations spanning the top GDP-contributing sectors of the US economy, judged blind by industry experts.

Publisher
OpenAI
Year
2025
Venue
preprint
Authors
13
Hosting
External source, license unknown

Introduces 1 artifact - 1 eval

TL;DR

Semantic Scholar

Frontier model performance on GDPval is improving roughly linearly over time, and the current best frontier models are approaching industry experts in deliverable quality.
