Manling Li

RAGEN-2: Reasoning Collapse in Agentic RL

arXiv 2026

AutoResearch AI: Towards AI-Powered Research Automation for Scientific Discovery

arXiv 2026

Theory of Space: Can Foundation Models Construct Spatial Beliefs through Active Exploration?

arXiv 2026

Interactive Evaluation Requires a Design Science

arXiv 2026

Learning from Trials and Errors: Reflective Test-Time Planning for Embodied LLMs

arXiv 2026

RAGEN: Understanding Self-Evolution in LLM Agents via Multi-Turn Reinforcement Learning

arXiv 2025

Adaptation of Agentic AI

arXiv 2025

EmbodiedBench: Comprehensive Benchmarking Multi-modal Large Language Models for Vision-Driven Embodied Agents

arXiv 2025

Re-thinking Temporal Search for Long-Form Video Understanding

CVPR 2025

Exploring Diffusion Transformer Designs via Grafting

arXiv 2025

Bring Reason to Vision: Understanding Perception and Reasoning through Model Merging

arXiv 2025

LEMONADE: A Large Multilingual Expert-Annotated Abstractive Event Dataset for the Real World

arXiv 2025

ENACT: Evaluating Embodied Cognition with World Modeling of Egocentric Interaction

arXiv 2025

Spatial Mental Modeling from Limited Views

arXiv 2025

Chain-of-Experts: Unlocking the Communication Power of Mixture-of-Experts Models

arXiv 2025

CaptionQA: Is Your Caption as Useful as the Image Itself?

arXiv 2025

A Simple "Try Again" Can Elicit Multi-Turn LLM Reasoning

arXiv 2025

Why Is Spatial Reasoning Hard for VLMs? An Attention Mechanism Perspective on Focus Areas

arXiv 2025

HourVideo: 1-Hour Video-Language Understanding

arXiv 2024

Visually Descriptive Language Model for Vector Graphics Reasoning

arXiv 2024

MentalArena: Self-play Training of Language Models for Diagnosis and Treatment of Mental Health Disorders

arXiv 2024

Embodied Agent Interface: Benchmarking LLMs for Embodied Decision Making

arXiv 2024

IKEA Manuals at Work: 4D Grounding of Assembly Instructions on Internet Videos

arXiv 2024

Chain-of-Action: Faithful and Multimodal Question Answering through Large Language Models

arXiv 2024