Xuekai Zhu
- Papers: 14
Authored papers
SEMA: Simple yet Effective Learning for Multi-Turn Jailbreak Attacks
arXiv 2026
SimpleVLA-RL: Scaling VLA Training via Reinforcement Learning
arXiv 2025
TTRL: Test-Time Reinforcement Learning
arXiv 2025
MedXpertQA: Benchmarking Expert-Level Medical Reasoning and Understanding
arXiv 2025
Seek in the Dark: Reasoning via Test-Time Instance-Level Policy Gradient in Latent Space
arXiv 2025
SSRL: Self-Search Reinforcement Learning
arXiv 2025
FlowRL: Matching Reward Distributions for LLM Reasoning
arXiv 2025
A Survey of Reinforcement Learning for Large Reasoning Models
arXiv 2025
Towards a Unified View of Large Language Model Post-Training
arXiv 2025
UltraMedical: Building Specialized Generalists in Biomedicine
arXiv 2024
Fourier Position Embedding: Enhancing Attention's Periodic Extension for Length Generalization
arXiv 2024
How to Synthesize Text Data without Model Collapse?
arXiv 2024
PaD: Program-aided Distillation Can Teach Small Models Reasoning Better than Chain-of-thought Fine-tuning
arXiv 2023
CRaSh: Clustering, Removing, and Sharing Enhance Fine-tuning without Full Large Language Model
arXiv 2023
Frequent co-authors
10 from 14 papers