Binghai Wang
4 papers
Authored papers
Outcome Accuracy is Not Enough: Aligning the Reasoning Process of Reward Models
arXiv 2026
WorldPM: Scaling Human Preference Modeling
arXiv 2025
Secrets of RLHF in Large Language Models Part II: Reward Modeling
arXiv 2024
RMB: Comprehensively Benchmarking Reward Models in LLM Alignment
arXiv 2024
Affiliations
No known affiliations.
Frequent co-authors
10 from 4 papers