Lifan Yuan
UIUC PhD student and Tsinghua/OpenBMB collaborator; lead author of UltraInteract / Eurus and of implicit process reward work (Free Process Rewards without Process Labels).
- Role: grad-student
- Currently at: University of Illinois Urbana-Champaign
- GitHub: github.com/lifan-yuan
- Scholar: scholar.google.com/citations
- Papers: 12
Authored papers (12)
The Entropy Mechanism of Reinforcement Learning for Reasoning Language Models
arXiv 2025
Process Reinforcement through Implicit Rewards
arXiv 2025
RLVE: Scaling Up Reinforcement Learning for Language Models with Adaptive Verifiable Environments
arXiv 2025
The Unreasonable Effectiveness of Entropy Minimization in LLM Reasoning
arXiv 2025
From f(x) and g(x) to f(g(x)): LLMs Learn New Skills in RL by Composing Old Ones
arXiv 2025
RLPR: Extrapolating RLVR to General Domains without Verifiers
arXiv 2025
Free Process Rewards without Process Labels
arXiv 2024
Noise Contrastive Alignment of Language Models with Explicit Rewards
arXiv 2024
Advancing LLM Reasoning Generalists with Preference Trees
arXiv 2024
Controllable Preference Optimization: Toward Controllable Multi-Objective Alignment
arXiv 2024
UltraFeedback: Boosting Language Models with High-quality Feedback
ICML
CRAFT: Customizing LLMs by Creating and Retrieving from Specialized Toolsets
arXiv 2023
Tool contributions (1)
Affiliations
Previously
Frequent co-authors
10 from 12 papers