Shuang Qiu
5 papers
Authored papers
Self-Reflective Generation at Test Time
arXiv 2025
Segment Policy Optimization: Effective Segment-Level Credit Assignment in RL for Large Language Models
arXiv 2025
Rewards-in-Context: Multi-objective Alignment of Foundation Models with Dynamic Preference Adjustment
arXiv 2024
Arithmetic Control of LLMs for Diverse User Preferences: Directional Preference Alignment with Multi-Objective Rewards
arXiv 2024
Contrastive UCB: Provably Efficient Contrastive Self-Supervised Learning in Online Reinforcement Learning
arXiv 2022
Affiliations
No known affiliations.
Frequent co-authors
10 from 5 papers