Your Language Model is Its Own Critic: Reinforcement Learning with Value Estimation from Actor's Internal States
arXiv 2026
Jeonghoon Shim
Jongwon Lim
Woojin Ahn
Yohan Jo
Yunho Choi