Attribution
Understanding the Performance Gap in Preference Learning: A Dichotomy of RLHF and DPO
arXiv 2025
Maryam Fazel
Ruizhe Shi
Runlong Zhou
Simon S. Du
Zihan Zhang