Papers

Trending research and the full catalog - each paper linked to the benchmarks, methods, and models it introduces.

Filtered by domain: Face Verification

VibeThinker-3B: Exploring the Frontier of Verifiable Reasoning in Small Language Models

15 Jun 2026

This technical report introduces VibeThinker-3B, a compact dense model with 3B parameters developed to investigate how far verifiable reasoning can be pushed within a strictly small-model regime.

Face Verification Language Modeling Reinforcement Learning

1.4k, 1.2/h

GateMem: Benchmarking Memory Governance in Multi-Principal Shared-Memory Agents

17 Jun 2026

Memory benchmarks for LLM agents largely assume single-user settings, leaving shared assistants for hospitals, workplaces, campuses, and households understudied. In these deployments, multiple principals write to a common memory pool and query it under different roles, scopes,…

Face Verification Language Modeling

1030.1/h

Agents' Last Exam

3 Jun 2026

Recent AI systems have achieved strong results on a wide range of benchmarks, yet these gains have not translated into economically meaningful deployment across many professional domains.

Agents Face Verification

7530.3/h