Jonathan D. Chang
- Papers: 5
5 papers
Authored papers
REBEL: Reinforcement Learning via Regressing Relative Rewards
arXiv 2024
Dataset Reset Policy Optimization for RLHF
arXiv 2024
Regressing the Relative Future: Efficient Policy Optimization for Multi-turn RLHF
arXiv 2024
Critique-out-Loud Reward Models
arXiv 2024
Learning to Generate Better Than Your LLM
arXiv 2023
Affiliations
No known affiliations.
Frequent co-authors
10 from 5 papers