0

NuminaMath

Fresh

An 860k-problem competition-math dataset with detailed solutions, the open community's go-to corpus for training math-specialized LLMs.

Type
SFT Dataset
Publisher
Numina
Runtime
hf_parquet
License
Apache-2.0
Size
860k problem-solution pairs (CoT variant), 73k tool-integrated (TIR)
Published
May 2026

Cite

Notes

Only stored in your browser.

Lift evidence

4

Models

Notable models trained on it

NuminaMath-7B-TIR (won AIMO Progress Prize 1, 2024)DeepSeek-Math derivativesQwen2.5-Math fine-tunescomponent of OpenThoughts and many reasoning mixtures

Papers

1

Contributors

3