Simon S. Du
6 papers
Authored papers
MLS-Bench: A Holistic and Rigorous Assessment of AI Systems on Building Better AI
arXiv 2026
Extragradient Preference Optimization (EGPO): Beyond Last-Iterate Convergence for Nash Learning from Human Feedback
arXiv 2025
Understanding the Performance Gap in Preference Learning: A Dichotomy of RLHF and DPO
arXiv 2025
Unleashing the Power of Pre-trained Language Models for Offline Reinforcement Learning
arXiv 2023
Dichotomy of Early and Late Phase Implicit Biases Can Provably Induce Grokking
arXiv 2023
Denoised MDPs: Learning World Models Better Than the World Itself
arXiv 2022
Affiliations
No known affiliations.
Frequent co-authors
10 from 6 papers