Attribution
Your Language Model is Its Own Critic: Reinforcement Learning with Value Estimation from Actor's Internal States
arXiv 2026
ToolDial: Multi-turn Dialogue Generation Method for Tool-Augmented Language Models
arXiv 2025
from 2 papers
Yohan Jo
Cheongsu Lim
Gyuhyeon Seo
Jongwon Lim
Minjae Oh
Woojin Ahn
Yunho Choi