Attribution
Optimizing Chain-of-Thought Reasoners via Gradient Variance Minimization in Rejection Sampling and RL
arXiv 2025
Hanning Zhang
Hanze Dong
Jiarui Yao
Nan Jiang
Tong Zhang
Wei Xiong