Beyond Reward: Offline Preference-guided Policy Optimization
arXiv 2023
Diyuan Shi
Donglin Wang
Jinxin Liu
Li He