Attribution
Self-Exploring Language Models: Active Preference Elicitation for Online Alignment
arXiv 2024
Fine-Tuning Language Models with Advantage-Induced Policy Alignment
arXiv 2023
from 2 papers
Banghua Zhu
professor
Chenguang Zhu
Donghan Yu
Felipe Vieira Frujeri
Han Zhong
Hany Hassan
Jiantao Jiao
Michael I. Jordan
Shenao Zhang
Shi Dong