Attribution
Reasoning with Reinforced Functional Token Tuning
arXiv 2025
Consistent Paths Lead to Truth: Self-Rewarding Reinforcement Learning for LLM Reasoning
from 2 papers
DaCheng Tao
Kongcheng Zhang
Mingli Song
Qi Yao
Shunyu Liu
Jiaxing Huang
Jieping Ye
Wenkai Fang
Yingjie Wang