Papers

Trending research and the full catalog - each paper linked to the benchmarks, methods, and models it introduces.

Filtered by domain: question-answering

OPID: On-Policy Skill Distillation for Agentic Reinforcement Learning

25 Jun 2026

Outcome-based reinforcement learning provides a stable optimization backbone for language agents, but its sparse trajectory-level rewards provide little guidance on which intermediate decisions should be reinforced or suppressed.

Question Answering · Reinforcement Learning

330.5/h

PhysBrain 1.0 Technical Report

Kai Chen, Bin Yu, Shijie Lian et al. · 14 May 2026

PhysBrain 1.0 leverages human egocentric video to generate physical commonsense supervision for vision-language-action models, achieving state-of-the-art performance in embodied control tasks through capability-preserving adaptation.

Image Understanding · Question Answering · Robotics

350.0/h

Harness-1: Reinforcement Learning for Search Agents with State-Externalizing Harnesses

1 Jun 2026

Search agents are often trained as policies over growing transcripts: the model must decide how to search while also remembering what it has seen, which evidence is useful, which constraints remain open, and which claims have actually been checked.

Question Answering · Reinforcement Learning · Retrieval

7850.2/h