Guorui Zhou
- Papers: 14
Authored papers (14)
Agentic Reinforced Policy Optimization
arXiv 2025
Tool-Star: Empowering LLM-Brained Multi-Tool Reasoner via Reinforcement Learning
arXiv 2025
Modality Curation: Building Universal Embeddings for Advanced Multimodal Information Retrieval
arXiv 2025
Thyme: Think Beyond Images
arXiv 2025
Agentic Entropy-Balanced Policy Optimization
arXiv 2025
Kwai Keye-VL 1.5 Technical Report
arXiv 2025
CE-GPPO: Controlling Entropy via Gradient-Preserving Clipping Policy Optimization in Reinforcement Learning
arXiv 2025
Klear-Reasoner: Advancing Reasoning Capability via Gradient-Preserving Clipping Policy Optimization
arXiv 2025
Stabilizing Knowledge, Promoting Reasoning: Dual-Token Constraints for RLVR
arXiv 2025
Leanabell-Prover: Posttraining Scaling in Formal Reasoning
arXiv 2025
RLEP: Reinforcement Learning with Experience Replay for LLM Reasoning
arXiv 2025
ASPO: Asymmetric Importance Sampling Policy Optimization
arXiv 2025
Efficient Training of Diffusion Mixture-of-Experts Models: A Practical Recipe
arXiv 2025
Deep Interest Network for Click-Through Rate Prediction
arXiv 2017
Frequent co-authors
10 from 14 papers