Attribution
InftyThink+: Effective and Efficient Infinite-Horizon Reasoning via Reinforcement Learning
arXiv 2026
Ring-lite: Scalable Reasoning via C3PO-Stabilized Reinforcement Learning for LLMs
arXiv 2025
Every Step Evolves: Scaling Reinforcement Learning for Trillion-Scale Thinking Model
from 3 papers
Jun Zhou
Shuaicheng Li
Zhiqiang Zhang
Zujie Wen
Bin Hu
Cai Chen
Deng Zhao
Hao Dai
Jia Guo
Jiaming Liu