Attribution
GRAM: A Generative Foundation Reward Model for Reward Generalization (arXiv 2025)
RoVRM: A Robust Visual Reward Model Optimized via Auxiliary Textual Preference Data (arXiv 2024)
ESRL: Efficient Sampling-based Reinforcement Learning for Sequence Generation (arXiv 2023)
from 3 papers
Chenglong Wang
Jingbo Zhu
Tong Xiao
Yifu Huo
Bei Li
Chunliang Zhang
Murun Yang
Qiaozhi He
Yang Gan
Yongyu Mu