Attribution
Cognitive Behaviors that Enable Self-Improving Reasoners, or, Four Habits of Highly Effective STaRs
arXiv 2025
Big-Math: A Large-Scale, High-Quality Math Dataset for Reinforcement Learning in Language Models
from 2 papers
Anikait Singh
Kanishk Gandhi
Alon Albalak
Ayush Chakravarthy
Chase Blagden
Dakota Mahan
Duy Phung
Louis Castricato
Nick Haber
Noah D. Goodman