Yuxin Zuo
- Papers: 13
Authored papers (13)
Rethinking On-Policy Distillation of Large Language Models: Phenomenology, Mechanism, and Recipe
arXiv 2026
Post-Trained MoE Can Skip Half Experts via Self-Distillation
arXiv 2026
P1-VL: Bridging Visual Perception and Scientific Reasoning in Physics Olympiads
arXiv 2026
SimpleVLA-RL: Scaling VLA Training via Reinforcement Learning
arXiv 2025
TTRL: Test-Time Reinforcement Learning
arXiv 2025
The Entropy Mechanism of Reinforcement Learning for Reasoning Language Models
arXiv 2025
MedXpertQA: Benchmarking Expert-Level Medical Reasoning and Understanding
arXiv 2025
P1: Mastering Physics Olympiads with Reinforcement Learning
arXiv 2025
Towards a Unified View of Large Language Model Post-Training
arXiv 2025
A Survey of Reinforcement Learning for Large Reasoning Models
arXiv 2025
SSRL: Self-Search Reinforcement Learning
arXiv 2025
FlowRL: Matching Reward Distributions for LLM Reasoning
arXiv 2025
JustRL: Scaling a 1.5B LLM with a Simple RL Recipe
arXiv 2025
Frequent co-authors
10 from 13 papers