Attribution
CLIPO: Contrastive Learning in Policy Optimization Generalizes RLVR
arXiv 2026
Search Self-play: Pushing the Frontier of Agent Capability without Supervision
arXiv 2025
Multi-CPR: A Multi Domain Chinese Dataset for Passage Retrieval
arXiv 2022
from 3 papers
Pengyu Cheng
Xiaoxi Jiang
Chutian Wang
Dingkun Long
Guangwei Xu
Guojun Zhang
Haonan Chen
Haotian Xu
Hongliang Lu
Jiajun Song