GPQA: A Graduate-Level Google-Proof Q&A Benchmark
Introduces GPQA, a set of 448 multiple-choice questions in biology, physics, and chemistry, written by PhD-level domain experts, that skilled non-experts struggle to answer correctly even with web access.
- Publisher: New York University
- Year: 2023
- Venue: COLM
- Authors: 9
- Hosting: External source, license unknown
Introduces 2 artifacts - 2 evals
TL;DR
Semantic Scholar
GPQA is a challenging dataset of 448 multiple-choice questions written by domain experts in biology, physics, and chemistry. The authors hope it can help devise ways for human experts to reliably get truthful information from AI systems that surpass human capabilities.