Attribution
$β$-DPO: Direct Preference Optimization with Dynamic $β$
arXiv 2024
Towards Robust Alignment of Language Models: Distributionally Robustifying Direct Preference Optimization
A Bi-Step Grounding Paradigm for Large Language Models in Recommendation Systems
arXiv 2023
from 3 papers
Bolin Ding
Jiancan Wu
Jinyang Gao
Junkang Wu
Xiang Wang
Xiangnan He
Yuexiang Xie
Chong Chen
Fuli Feng
Jiawei Chen