PRM800K

Active

800,000 step-level human labels on GPT-4 solutions to MATH problems. A widely used dataset for training and evaluating process reward models.

Publisher
OpenAI
Capabilities
Math, LLM Judging
Domain
math
Format
HF Dataset
Size
800,000 step-level labels
License
MIT
Published
May 2023
Notable for
Benchmark for evaluating math reasoning and LLM judging in the math domain.
