Beyond Reward: Offline Preference-guided Policy Optimization
arXiv 2023
Donglin Wang
Jinxin Liu
Li He
Yachen Kang