GPQA: A Graduate-Level Google-Proof Q&A Benchmark

Introduces GPQA, 448 PhD-written multiple-choice questions in biology, physics, and chemistry that skilled non-experts struggle to answer correctly even with web access.

Year
2023
Venue
COLM
Authors
9
Hosting
External source, license unknown

Introduces 2 artifacts - 2 evals

TL;DR

Semantic Scholar

Presents GPQA, a challenging dataset of 448 multiple-choice questions written by domain experts in biology, physics, and chemistry. The authors hope it can help devise ways for human experts to reliably get truthful information from AI systems that surpass human capabilities.
