Attribution
SPRec: Leveraging Self-Play to Debias Preference Alignment for Large Language Model-based Recommendations
arXiv 2024
StepTool: A Step-grained Reinforcement Learning Framework for Tool Learning in LLMs
from 2 papers
Chongming Gao
Chuhan Wu
Jingtao Zhan
Kexin Huang
Min Zhang
Ruijun Chen
Shuai Wang
Shuai Yuan
Weizhi Ma
Xiangnan He