Fei-Fei Li

Stanford CS professor, co-founder of World Labs, co-director of Stanford HAI; creator of ImageNet and one of the defining figures of modern computer vision.

Role: professor
GitHub: Unknown
Papers: 25

Authored papers

25

ESI-Bench: Towards Embodied Spatial Intelligence that Closes the Perception-Action Loop

arXiv 2026

2026

RAGEN-2: Reasoning Collapse in Agentic RL

arXiv 2026

2026

Theory of Space: Can Foundation Models Construct Spatial Beliefs through Active Exploration?

arXiv 2026

2026

Learning from Trials and Errors: Reflective Test-Time Planning for Embodied LLMs

arXiv 2026

2026

PointWorld: Scaling 3D World Models for In-The-Wild Robotic Manipulation

arXiv 2026

2026

s1: Simple Test-Time Scaling

preprint

2025

RAGEN: Understanding Self-Evolution in LLM Agents via Multi-Turn Reinforcement Learning

arXiv 2025

2025

Re-thinking Temporal Search for Long-Form Video Understanding

CVPR 2025

2025

BEHAVIOR Robot Suite: Streamlining Real-World Whole-Body Manipulation for Everyday Household Activities

arXiv 2025

2025

Exploring Diffusion Transformer Designs via Grafting

arXiv 2025

2025

ENACT: Evaluating Embodied Cognition with World Modeling of Egocentric Interaction

arXiv 2025

2025

Spatial Mental Modeling from Limited Views

arXiv 2025

2025

QuantiPhy: A Quantitative Benchmark Evaluating Physical Reasoning Abilities of Vision-Language Models

arXiv 2025

2025

Thinking in Space: How Multimodal Large Language Models See, Remember, and Recall Spaces

CVPR 2025

2024

Embodied Agent Interface: Benchmarking LLMs for Embodied Decision Making

arXiv 2024

2024

TRANSIC: Sim-to-Real Policy Transfer by Learning from Online Correction

arXiv 2024

2024

HourVideo: 1-Hour Video-Language Understanding

arXiv 2024

2024

Agent AI: Surveying the Horizons of Multimodal Interaction

arXiv 2024

2024

VoxPoser: Composable 3D Value Maps for Robotic Manipulation with Language Models

arXiv 2023

2023

ZeroNVS: Zero-Shot 360-Degree View Synthesis from a Single Image

CVPR 2024

2023

Mini-BEHAVIOR: A Procedurally Generated Benchmark for Long-horizon Decision-Making in Embodied AI

arXiv 2023

2023

VIMA: General Robot Manipulation with Multimodal Prompts

arXiv 2022

2022

Mind Your Outliers! Investigating the Negative Impact of Outliers on Active Learning for Visual Question Answering

ACL 2021

2021

Making Sense of Vision and Touch: Self-Supervised Learning of Multimodal Representations for Contact-Rich Tasks

arXiv 2018

2018

Inferring and Executing Programs for Visual Reasoning

ICCV 2017

2017

Affiliations

Currently at

Stanford University

professor · university lab

Frequent co-authors

10

from 25 papers