Attribution
Beyond the Exploration-Exploitation Trade-off: A Hidden State Approach for LLM Reasoning in RLVR
arXiv 2025
from 1 paper
Faisal Nadeem Khan
Guanbo Huang
Jingyan Jiang
Qinting Jiang
Xiao Chen
Xiao Fan
Xiao Liang
Yi He
Zhi Wang