Attribution
Bridging Online and Offline RL: Contextual Bandit Learning for Multi-Turn Code Generation
arXiv 2026
Generalized-Smooth Nonconvex Optimization is As Efficient As Smooth Nonconvex Optimization
arXiv 2023
from 2 papers
Dongdong Chen
Huan Sun
Ruinan Jin
Yi Zhou
Yujia Xie
Zhaosong Lu
Ziru Chen
Ziyi Chen