Kanishk Gandhi
6 papers
Authored papers
Endless Terminals: Scaling RL Environments for Terminal Agents
arXiv 2026
Cognitive Behaviors that Enable Self-Improving Reasoners, or, Four Habits of Highly Effective STaRs
arXiv 2025
BoxingGym: Benchmarking Progress in Automated Experimental Design and Model Discovery
arXiv 2025
Big-Math: A Large-Scale, High-Quality Math Dataset for Reinforcement Learning in Language Models
arXiv 2025
Self-Supervised Alignment with Mutual Information: Learning to Follow Principles without Preference Labels
arXiv 2024
Stream of Search (SoS): Learning to Search in Language
arXiv 2024
Affiliations
No known affiliations.
Frequent co-authors
10 from 6 papers