Huizhuo Yuan
6 papers
Authored papers
Tensor Product Attention Is All You Need
arXiv 2025
On the Design of KL-Regularized Policy Gradient Algorithms for LLM Reasoning
arXiv 2025
Group Representational Position Encoding
arXiv 2025
MARS: Unleashing the Power of Variance Reduction for Training Large Models
arXiv 2024
Self-Play Fine-Tuning Converts Weak Language Models to Strong Language Models
arXiv 2024
Self-Play Preference Optimization for Language Model Alignment
arXiv 2024
Affiliations
No known affiliations.
Frequent co-authors
10 from 6 papers