OpenThoughts
Fresh
A fully-open distillation of long DeepSeek-R1 reasoning traces - the community's flagship "open R1" SFT corpus for reasoning models.
- Type
- SFT Dataset
- Publisher
- Open Thoughts
- Capabilities
- Code GenerationMathScientific Reasoning
- Runtime
hf_parquet- License
- Apache-2.0
- Size
- 114k traces (OpenThoughts-114k) — 1.2M in OpenThoughts2
- Published
- May 2026
Cite
Notes
Only stored in your browser.
Lift evidence
4| Eval | Tools known to lift | Source paper |
|---|---|---|
| AIME 2024: Problems from the American Invitational Mathematics Examination | OpenThoughts | - |
| MATH-500 | OpenThoughts | - |
| GPQA Diamond | OpenThoughts | - |
| LiveCodeBench | OpenThoughts | - |
Models
Notable models trained on it
OpenThinker-7B / 32BOpenThinker2-32Bmany academic R1-style reproductions