Shenao Zhang
6 papers
Authored papers
Beyond Markovian: Reflective Exploration via Bayes-Adaptive RL for LLM Reasoning
arXiv 2025
Reward-Augmented Data Enhances Direct Preference Alignment of LLMs
arXiv 2024
Self-Exploring Language Models: Active Preference Elicitation for Online Alignment
arXiv 2024
Offline Reinforcement Learning for LLM Multi-Step Reasoning
arXiv 2024
Maximize to Explore: One Objective Function Fusing Estimation, Planning, and Exploration
NeurIPS 2023 11
Reason for Future, Act for Now: A Principled Framework for Autonomous LLM Agents with Provable Sample Efficiency
arXiv 2023
Affiliations
No known affiliations.
Frequent co-authors
10 from 6 papers