Attribution
Training a Generally Curious Agent (arXiv 2025)
Preference Fine-Tuning of LLMs Should Leverage Suboptimal, On-Policy Data (arXiv 2024)
from 2 papers
Jeff Schneider
Abitha Thankaraj
Anikait Singh
Archit Sharma
Aviral Kumar
Chelsea Finn
J. Zico Kolter
Rafael Rafailov
Ruslan Salakhutdinov
professor
Stefano Ermon