Attribution
FIPO: Eliciting Deep Reasoning with Future-KL Influenced Policy Optimization
arXiv 2026
On the Direction of RLVR Updates for LLM Reasoning: Identification and Exploitation
from 2 papers
Bolin Ding
Chiyu Ma
Guoyin Wang
Jinda Lu
Jingren Zhou
Kexin Huang
Jiancan Wu
Junkang Wu
Shangshang Wang
Shuo Yang