Attribution
Consistent Paths Lead to Truth: Self-Rewarding Reinforcement Learning for LLM Reasoning
arXiv 2025
Reasoning with Reinforced Functional Token Tuning
from 2 papers
Baisheng Lai
DaCheng Tao
Kongcheng Zhang
Mingli Song
Shunyu Liu
Jiaxing Huang
Jieping Ye
Wenkai Fang
Yingjie Wang