Attribution
Stop Summation: Min-Form Credit Assignment Is All Process Reward Model Needs for Reasoning
arXiv 2025
Scaling Offline Model-Based RL via Jointly-Optimized World-Action Model Pretraining
arXiv 2024
from 2 papers
Gang Xiong
Jie Cheng
Ruixi Qiao
Binhua Li
Chao Guo
Fei-Yue Wang
Junle Wang
Lijun Li
Qinghai Miao
Yingwei Ma