Attribution
Light-R1: Curriculum SFT, DPO and RL for Long COT from Scratch and Beyond
arXiv 2025
Chain-of-Thought Matters: Improving Long-Context Language Models with Reasoning Path Supervision
from 2 papers
Xiangzheng Zhang
Dawei Zhu
Fenrui Xiao
Guangxiang Zhao
Junchen Liu
Junfeng Ran
Liang Wen
Lifu Tang
Lin Sun
Qi An