Hanning Zhang

Papers: 5

Attribution

Affiliations & profile: Semantic Scholar

Authored papers

Optimizing Chain-of-Thought Reasoners via Gradient Variance Minimization in Rejection Sampling and RL

arXiv 2025

Self-rewarding correction for mathematical reasoning

arXiv 2025

Entropy-Regularized Process Reward Model

arXiv 2024

R-Tuning: Instructing Large Language Models to Say 'I Don't Know'

arXiv 2023

Mitigating the Alignment Tax of RLHF

arXiv 2023

Affiliations

No known affiliations.

Frequent co-authors

from 5 papers

Tong Zhang

Hanze Dong

Nan Jiang

Shizhe Diao

Wei Xiong

Yong Lin

Heng Ji (professor)

Rui Pan

Chenlu Ye

Dylan Zhang