Attribution
FSPO: Few-Shot Preference Optimization of Synthetic Preference Data in LLMs Elicits Effective Personalization to Real Users
arXiv 2025
RLVF: Learning from Verbal Feedback without Overgeneralization
arXiv 2024
from 2 papers
Archit Sharma
Chelsea Finn
Eric Mitchell
Alexander Khazatsky
Anikait Singh
Annie S. Chen
Kyle Hsu
Moritz Stephan
Stefano Ermon
Tatsunori Hashimoto
professor