Attribution
T^2PO: Uncertainty-Guided Exploration Control for Stable Multi-Turn Agentic Reinforcement Learning
arXiv 2026
Chenwei Zhang
Haixin Wang
Hejie Cui
Shijie Geng
Shuowei Jin
Xin Liu
Xinyang Zhang
Yizhou Sun
Zhenyu Shi