Attribution
Agent models: Internalizing Chain-of-Action Generation into Reasoning models
arXiv 2025
OpenRFT: Adapting Reasoning Foundation Model for Domain-specific Tasks with Reinforcement Fine-Tuning
arXiv 2024
o1-Coder: an o1 Replication for Coding
from 3 papers
Jitao Sang
YuQi Yang
Yuxiang Zhang
Jinlin Xiao
Chao Kong
Shangxi Wu
Xinyan Wen
Yuhang Wang