Attribution
Your Language Model is Its Own Critic: Reinforcement Learning with Value Estimation from Actor's Internal States
arXiv 2026
DAHL: Domain-specific Automated Hallucination Evaluation of Long-Form Text through a Benchmark Dataset in Biomedicine
arXiv 2024
from 2 papers
Dongjun Jang
Hyopil Shin
Jean Seo
Jeonghoon Shim
Minjae Oh
Woojin Ahn
Yohan Jo
Yunho Choi