Lifan Yuan

UIUC PhD student and Tsinghua/OpenBMB collaborator; lead author of UltraInteract / Eurus and of process reward work that needs no process labels.

Role: grad-student
Currently at: University of Illinois Urbana-Champaign
GitHub: github.com/lifan-yuan
Scholar: scholar.google.com/citations
Papers: 12

Attribution

Affiliations & profile: scholar.google.com/citations

12 papers · 1 tool contribution

Authored papers

The Entropy Mechanism of Reinforcement Learning for Reasoning Language Models

arXiv 2025

Process Reinforcement through Implicit Rewards

arXiv 2025

RLVE: Scaling Up Reinforcement Learning for Language Models with Adaptive Verifiable Environments

arXiv 2025

From f(x) and g(x) to f(g(x)): LLMs Learn New Skills in RL by Composing Old Ones

arXiv 2025

RLPR: Extrapolating RLVR to General Domains without Verifiers

arXiv 2025

The Unreasonable Effectiveness of Entropy Minimization in LLM Reasoning

arXiv 2025

Noise Contrastive Alignment of Language Models with Explicit Rewards

arXiv 2024

Controllable Preference Optimization: Toward Controllable Multi-Objective Alignment

arXiv 2024

Advancing LLM Reasoning Generalists with Preference Trees

arXiv 2024

Free Process Rewards without Process Labels

arXiv 2024

UltraFeedback: Boosting Language Models with High-quality Feedback

ICML

2023

CRAFT: Customizing LLMs by Creating and Retrieving from Specialized Toolsets

arXiv 2023

Tool contributions

UltraFeedback

OpenBMB

OpenBMB's preference dataset of 64k prompts, annotated with GPT-4 critiques across four aspects: instruction-following, truthfulness, honesty, and helpfulness. Widely used as an open baseline for DPO training.

Preference · Instruction Following · Hallucination · Safety

Affiliations

Currently at

University of Illinois Urbana-Champaign

grad-student · university lab

Previously

Tsinghua University · university lab

Frequent co-authors

from 12 papers

Ganqu Cui

researcher

9 shared papers

Hao Peng

8 shared papers

Ning Ding

researcher

8 shared papers

Zhiyuan Liu

professor

8 shared papers

Maosong Sun

professor

6 shared papers

Bowen Zhou

professor

Huayu Chen

Hanbin Wang

Ruobing Xie

Weize Chen