Papers

Trending research and the full catalog - each paper linked to the benchmarks, methods, and models it introduces.

Filtered by domain: Agents

AOHP: An Open-Source OS-Level Agent Harness for Personalized, Efficient and Secure Interaction

22 Jun 2026

AI agents are driving a new software paradigm, with the ability to autonomously call tools, extract information, manage memory, and complete tasks that span applications and data sources.

Agents

890.5/h

EnterpriseClawBench: Benchmarking Agents from Real Workplace Sessions

22 Jun 2026

Enterprise agents increasingly operate inside workspaces: they read heterogeneous files, invoke tools, and deliver business artifacts. We introduce EnterpriseClawBench, an enterprise agent benchmark constructed from proprietary, real-world agent sessions.

Agents

Constraint Tax in Open-Weight LLMs: An Empirical Study of Tool Calling Suppression Under Structured Output Constraints

24 Jun 2026

Tool Calling and Structured Output are two core capabilities of modern Agent systems, yet their interaction under joint deployment conditions remains insufficiently understood.

Agents, Instruction Following, Language Modeling