Kaiyan Zhang
- Papers: 24
Authored papers (24)
Post-Trained MoE Can Skip Half Experts via Self-Distillation
arXiv 2026
SimpleVLA-RL: Scaling VLA Training via Reinforcement Learning
arXiv 2025
TTRL: Test-Time Reinforcement Learning
arXiv 2025
Process Reinforcement through Implicit Rewards
arXiv 2025
MedXpertQA: Benchmarking Expert-Level Medical Reasoning and Understanding
arXiv 2025
A Survey of Reinforcement Learning for Large Reasoning Models
arXiv 2025
SSRL: Self-Search Reinforcement Learning
arXiv 2025
FlowRL: Matching Reward Distributions for LLM Reasoning
arXiv 2025
Towards a Unified View of Large Language Model Post-Training
arXiv 2025
SciArena: An Open Evaluation Platform for Foundation Models in Scientific Literature Tasks
arXiv 2025
Video-T1: Test-Time Scaling for Video Generation
ICCV 2025
Can 1B LLM Surpass 405B LLM? Rethinking Compute-Optimal Test-Time Scaling
arXiv 2025
GenPRM: Scaling Test-Time Compute of Process Reward Models via Generative Reasoning
arXiv 2025
JustRL: Scaling a 1.5B LLM with a Simple RL Recipe
arXiv 2025
Efficient Diffusion Models: A Comprehensive Survey from Principles to Practices
arXiv 2024
UltraMedical: Building Specialized Generalists in Biomedicine
arXiv 2024
Free Process Rewards without Process Labels
arXiv 2024
Large Language Models as Biomedical Hypothesis Generators: A Comprehensive Evaluation
arXiv 2024
Fast and Slow Generating: An Empirical Study on Large and Small Language Models Collaborative Decoding
arXiv 2024
How to Synthesize Text Data without Model Collapse?
arXiv 2024
Fourier Position Embedding: Enhancing Attention's Periodic Extension for Length Generalization
arXiv 2024
Intuitive Fine-Tuning: Towards Simplifying Alignment into a Single Process
arXiv 2024
PaD: Program-aided Distillation Can Teach Small Models Reasoning Better than Chain-of-thought Fine-tuning
arXiv 2023
CRaSh: Clustering, Removing, and Sharing Enhance Fine-tuning without Full Large Language Model
arXiv 2023
Affiliations
Frequent co-authors
10 from 24 papers