Attribution
Do Not Let Low-Probability Tokens Over-Dominate in RL for LLMs
arXiv 2025
Mitigating Hallucinations in Large Vision-Language Models via DPO: On-Policy Data Hold the Key
CVPR 2025
from 2 papers
Dongqi Han
Dongsheng Li
Xufang Luo
Zhihe Yang
Zhiyuan He
Zilong Wang