PRM800K
Active
800,000 step-level human labels on GPT-4 solutions to MATH problems - the canonical process-reward training/eval dataset.
- Publisher: OpenAI
- Capabilities: Math, LLM Judging
- Domain: math
- Format: HF Dataset
- Size: 800,000 step-level labels
- License: MIT
- Published: May 2023
- Notable for: Benchmark for evaluating math reasoning and LLM judging in the math domain.
- Canonical: github.com/openai/prm800k
FAQ
- What is PRM800K?
- 800,000 step-level human labels on GPT-4 solutions to MATH problems - the canonical process-reward training/eval dataset.
- What capabilities does PRM800K test?
- PRM800K evaluates math and LLM judging.
- What license is PRM800K under?
- PRM800K is available under the MIT license.
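
To make "step-level labels" concrete, here is a minimal sketch of how such a label might be consumed. The record layout, the field names (`problem`, `steps`, `text`, `rating`), and the rating scale (1 = correct, 0 = neutral, -1 = incorrect) are simplifying assumptions for illustration only; they are not the dataset's actual schema, which should be checked in the repository.

```python
from typing import Optional

# Hypothetical, simplified record: one solution split into steps, each with
# a human rating. Field names and the rating scale are assumptions here.
example = {
    "problem": "What is 2 + 3 * 4?",
    "steps": [
        {"text": "Multiply first: 3 * 4 = 12.", "rating": 1},
        {"text": "Add: 2 + 12 = 15.", "rating": -1},
        {"text": "So the answer is 15.", "rating": -1},
    ],
}


def first_incorrect_step(record: dict) -> Optional[int]:
    """Return the index of the first step rated incorrect, or None if there is none."""
    for index, step in enumerate(record["steps"]):
        if step["rating"] == -1:
            return index
    return None


print(first_incorrect_step(example))
```

Locating the first incorrect step is the kind of signal a process-reward model learns from, in contrast to an outcome-only label that records just whether the final answer is right.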