Xingtai Lv
- Papers: 12
Authored papers (12)
Post-Trained MoE Can Skip Half Experts via Self-Distillation
arXiv 2026
Process Reinforcement through Implicit Rewards
arXiv 2025
FlowRL: Matching Reward Distributions for LLM Reasoning
arXiv 2025
A Survey of Reinforcement Learning for Large Reasoning Models
arXiv 2025
Towards a Unified View of Large Language Model Post-Training
arXiv 2025
Fast and Slow Generating: An Empirical Study on Large and Small Language Models Collaborative Decoding
arXiv 2024
How to Synthesize Text Data without Model Collapse?
arXiv 2024
UltraMedical: Building Specialized Generalists in Biomedicine
arXiv 2024
Fourier Position Embedding: Enhancing Attention's Periodic Extension for Length Generalization
arXiv 2024
Intuitive Fine-Tuning: Towards Simplifying Alignment into a Single Process
arXiv 2024
OpenDelta: A Plug-and-play Library for Parameter-efficient Adaptation of Pre-trained Models
arXiv 2023
Sparse Low-rank Adaptation of Pre-trained Language Models
arXiv 2023
Affiliations
Frequent co-authors
10 from 12 papers