Verifiers Math (math-python)

Multi-turn math problem-solving environment in which the model writes Python code, runs it in a sandbox, and uses the output to compute and verify numerical answers.
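The interaction described above can be sketched roughly as follows. This is a minimal illustration, not the environment's actual implementation or the verifiers API: the names run_python, answers_match, rollout, and scripted_policy are hypothetical, the 0/1 reward is an assumption, and a plain subprocess with a timeout stands in for a real sandbox (it provides no isolation).

```python
import subprocess
import sys


def run_python(code: str, timeout: float = 5.0) -> str:
    """Run code in a separate interpreter and return stdout, or an error string.

    Assumption: a subprocess with a timeout stands in for a real sandbox.
    """
    try:
        proc = subprocess.run(
            [sys.executable, "-c", code],
            capture_output=True,
            text=True,
            timeout=timeout,
        )
    except subprocess.TimeoutExpired:
        return "error: timed out"
    if proc.returncode != 0:
        lines = proc.stderr.strip().splitlines()
        return "error: " + (lines[-1] if lines else "unknown failure")
    return proc.stdout.strip()


def answers_match(predicted: str, reference: str, tol: float = 1e-6) -> bool:
    """Compare two answers numerically, falling back to exact string match."""
    try:
        return abs(float(predicted) - float(reference)) <= tol
    except ValueError:
        return predicted.strip() == reference.strip()


def rollout(policy, problem: str, reference: str, max_turns: int = 3) -> float:
    """Multi-turn loop: each turn the policy returns ("code", text) or ("answer", text).

    Code is executed and its output appended to the history; an answer ends
    the episode with reward 1.0 if it matches the reference, else 0.0.
    """
    history = [problem]
    for _ in range(max_turns):
        kind, text = policy(history)
        if kind == "answer":
            return 1.0 if answers_match(text, reference) else 0.0
        history.append(run_python(text))
    return 0.0  # ran out of turns without committing to an answer


def scripted_policy(history):
    """Hypothetical stand-in for a model: compute first, then answer."""
    if len(history) == 1:
        return ("code", "print(sum(range(1, 101)))")
    return ("answer", history[-1])


if __name__ == "__main__":
    print(rollout(scripted_policy, "What is 1 + 2 + ... + 100?", "5050"))
```

Returning the reward only when the policy commits to an answer keeps the reference hidden from the model during the tool-use turns; a real environment would replace scripted_policy with model calls and the subprocess with an isolated sandbox.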

Type: RL Env
Runtime: verifiers
License: MIT
Size: 1 env, thousands of problems (MATH, GSM8K, AIME-style)
Published: Jan 2025
