Yuxin Zuo

Papers: 13

Attribution

Affiliations & profile: Semantic Scholar

Authored papers

Rethinking On-Policy Distillation of Large Language Models: Phenomenology, Mechanism, and Recipe

arXiv 2026

Post-Trained MoE Can Skip Half Experts via Self-Distillation

arXiv 2026

P1-VL: Bridging Visual Perception and Scientific Reasoning in Physics Olympiads

arXiv 2026

SimpleVLA-RL: Scaling VLA Training via Reinforcement Learning

arXiv 2025

TTRL: Test-Time Reinforcement Learning

arXiv 2025

The Entropy Mechanism of Reinforcement Learning for Reasoning Language Models

arXiv 2025

MedXpertQA: Benchmarking Expert-Level Medical Reasoning and Understanding

arXiv 2025

A Survey of Reinforcement Learning for Large Reasoning Models

arXiv 2025

SSRL: Self-Search Reinforcement Learning

arXiv 2025

FlowRL: Matching Reward Distributions for LLM Reasoning

arXiv 2025

P1: Mastering Physics Olympiads with Reinforcement Learning

arXiv 2025

Towards a Unified View of Large Language Model Post-Training

arXiv 2025

JustRL: Scaling a 1.5B LLM with a Simple RL Recipe

arXiv 2025

Affiliations

No known affiliations.

Frequent co-authors

from 13 papers

Ning Ding

researcher

13 shared papers

Bowen Zhou

professor

11 shared papers

Ganqu Cui

researcher

Kaiyan Zhang

Xuekai Zhu

Youbang Sun

Yuchen Zhang

Yuchen Fan

Lei Bai

Li Sheng