GPQA: Graduate-Level STEM Knowledge Challenge
Active
GPQA contains challenging multiple-choice questions written by domain experts in biology, physics, and chemistry. The questions are designed to test advanced scientific understanding beyond what basic internet searches can answer. PhD-level experts in the corresponding domains reach 65% accuracy.
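The 65% expert figure is a plain accuracy over multiple-choice answers, so a model's score can be compared against it directly. The snippet below is a minimal sketch of that comparison, not the benchmark's official scoring code: the function names, the letter-style answer labels, and the sample predictions are all hypothetical and chosen only for illustration.

```python
from typing import Sequence

# Expert accuracy quoted in the description above.
EXPERT_ACCURACY = 0.65


def accuracy(predictions: Sequence[str], answers: Sequence[str]) -> float:
    """Fraction of questions where the predicted choice matches the answer key.

    Labels are compared case-insensitively after stripping whitespace.
    Letter labels such as "A" or "B" are an assumption of this sketch.
    """
    if len(predictions) != len(answers):
        raise ValueError("predictions and answers must have the same length")
    if not answers:
        raise ValueError("at least one question is required")
    correct = sum(
        p.strip().upper() == a.strip().upper()
        for p, a in zip(predictions, answers)
    )
    return correct / len(answers)


def gap_to_experts(model_accuracy: float) -> float:
    """Signed difference between a model's accuracy and the expert figure."""
    return model_accuracy - EXPERT_ACCURACY


# Hypothetical data: four questions, three answered correctly.
sample_predictions = ["A", "c", "B", "D"]
sample_answers = ["A", "C", "D", "D"]
sample_accuracy = accuracy(sample_predictions, sample_answers)
sample_gap = gap_to_experts(sample_accuracy)
```

A positive gap means the model scored above the quoted expert accuracy on the questions evaluated, and a negative gap means it scored below.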
- Publisher: New York University
- Domain: Knowledge
- License: MIT
- Published: Oct 2024
- Notable for: Benchmark for evaluating knowledge.
Related tools (2)
Implementations, trainers, datasets and scaffolds linked to this eval.
Papers (1)
FAQ
- What is GPQA: Graduate-Level STEM Knowledge Challenge?
- GPQA contains challenging multiple-choice questions written by domain experts in biology, physics, and chemistry. The questions are designed to test advanced scientific understanding beyond what basic internet searches can answer. PhD-level experts in the corresponding domains reach 65% accuracy.
- How can a model improve its GPQA: Graduate-Level STEM Knowledge Challenge score?
- Tools linked to GPQA: Graduate-Level STEM Knowledge Challenge on Sophon include GPQA RL Env (Prime Intellect) and GPQA Diamond RL Env (Community). Linked tools are RL environments, datasets, and scaffolds that target this eval.
- What license is GPQA: Graduate-Level STEM Knowledge Challenge under?
- GPQA: Graduate-Level STEM Knowledge Challenge is available under the MIT license.