Simon Shaolei Du
6 papers
Authored papers
Reinforcement Learning for Reasoning in Large Language Models with One Training Example
arXiv 2025
ThetaEvolve: Test-time Learning on Open Problems
arXiv 2025
RLVE: Scaling Up Reinforcement Learning for Language Models with Adaptive Verifiable Environments
arXiv 2025
Spurious Rewards: Rethinking Training Signals in RLVR
arXiv 2025
Free from Bellman Completeness: Trajectory Stitching via Model-based Return-conditioned Supervised Learning
arXiv 2023
LabelBench: A Comprehensive Framework for Benchmarking Adaptive Label-Efficient Learning
arXiv 2023
Affiliations
No known affiliations.
Frequent co-authors
10 from 6 papers