Archit Sharma
6 papers
Authored papers
FSPO: Few-Shot Preference Optimization of Synthetic Preference Data in LLMs Elicits Effective Personalization to Real Users
arXiv 2025
SERL: A Software Suite for Sample-Efficient Robotic Reinforcement Learning
arXiv 2024
A Critical Evaluation of AI Feedback for Aligning Large Language Models
arXiv 2024
Preference Fine-Tuning of LLMs Should Leverage Suboptimal, On-Policy Data
arXiv 2024
RLVF: Learning from Verbal Feedback without Overgeneralization
arXiv 2024
Stream of Search (SoS): Learning to Search in Language
arXiv 2024
Affiliations
No known affiliations.
Frequent co-authors
10 from 6 papers