Papers

Trending research and the full catalog - each paper linked to the benchmarks, methods, and models it introduces.

Filtered by domain: Robotics

InSight: Self-Guided Skill Acquisition via Steerable VLAs

23 Jun 2026

Vision-language-action (VLA) models can learn manipulation skills from demonstrations, but their capabilities are bounded by the skills in the training data. We present InSight, a framework that unlocks autonomous skill acquisition by rendering VLAs steerable at the…

Image Understanding, Robotics

150.0/h

LaWAM: Latent World Action Models for Efficient Dynamics-Aware Robot Policies

14 Jun 2026

Vision-Language-Action models (VLAs) leverage large-scale vision-language pretraining for semantic robot control, but often lack explicit foresight into how robot actions change the scene.

Robotics, Video generation, World Models

410.6/h

World-Language-Action Model for Unified World Modeling, Language Reasoning, and Action Synthesis

4 Jun 2026

We propose world-language-action (WLA) models as a new class of embodied foundation models. WLA takes textual instructions, images, and robot states as inputs to jointly predict textual subtasks, subgoal images, and robot actions, conjoining the world modeling interface…

Robotics

980.1/h

Geometric Action Model for Robot Policy Learning

15 Jun 2026

Generalist robot policies must follow user instructions while reasoning about how objects, cameras, and robot actions interact in the 3D physical world. Recent vision-language-action models (VLAs) and video world-action models (WAMs) inherit strong semantic or temporal priors…

Robotics

960.0/h