Bingxiang He
8 papers
Authored papers
Rethinking On-Policy Distillation of Large Language Models: Phenomenology, Mechanism, and Recipe
arXiv 2026
MiniCPM4: Ultra-Efficient LLMs on End Devices
arXiv 2025
Process Reinforcement through Implicit Rewards
arXiv 2025
MiniCPM-V 4.5: Cooking Efficient MLLMs via Architecture, Data, and Training Recipe
arXiv 2025
A Survey of Reinforcement Learning for Large Reasoning Models
arXiv 2025
JustRL: Scaling a 1.5B LLM with a Simple RL Recipe
arXiv 2025
Tell Me More! Towards Implicit User Intention Understanding of Language Model Driven Agents
arXiv 2024
UltraFeedback: Boosting Language Models with High-quality Feedback
ICML
Affiliations
No known affiliations.
Frequent co-authors
10 from 8 papers