Attribution
Segmenting Text and Learning Their Rewards for Improved RLHF in Language Model
arXiv 2025
A Dense Reward View on Aligning Text-to-Image Diffusion with Preference
arXiv 2024
Preference-grounded Token-level Guidance for Language Model Fine-tuning
from 3 papers
Mingyuan Zhou
Caiming Xiong
researcher
Congying Xia
Hany Awadalla
Shujian Zhang
Tianqi Chen
Weizhu Chen
Yihao Feng
Yueqin Yin
Yujia Xie