Manling Li

Authored papers

29

ESI-Bench: Towards Embodied Spatial Intelligence that Closes the Perception-Action Loop

arXiv 2026

RAGEN-2: Reasoning Collapse in Agentic RL

arXiv 2026

AutoResearch AI: Towards AI-Powered Research Automation for Scientific Discovery

arXiv 2026

Theory of Space: Can Foundation Models Construct Spatial Beliefs through Active Exploration?

arXiv 2026

Interactive Evaluation Requires a Design Science

arXiv 2026

Learning from Trials and Errors: Reflective Test-Time Planning for Embodied LLMs

arXiv 2026

RAGEN: Understanding Self-Evolution in LLM Agents via Multi-Turn Reinforcement Learning

arXiv 2025

Adaptation of Agentic AI

arXiv 2025

EmbodiedBench: Comprehensive Benchmarking Multi-modal Large Language Models for Vision-Driven Embodied Agents

arXiv 2025

Re-thinking Temporal Search for Long-Form Video Understanding

CVPR 2025

Exploring Diffusion Transformer Designs via Grafting

arXiv 2025

Bring Reason to Vision: Understanding Perception and Reasoning through Model Merging

arXiv 2025

LEMONADE: A Large Multilingual Expert-Annotated Abstractive Event Dataset for the Real World

arXiv 2025

ENACT: Evaluating Embodied Cognition with World Modeling of Egocentric Interaction

arXiv 2025

Spatial Mental Modeling from Limited Views

arXiv 2025

Chain-of-Experts: Unlocking the Communication Power of Mixture-of-Experts Models

arXiv 2025

CaptionQA: Is Your Caption as Useful as the Image Itself?

arXiv 2025

A Simple "Try Again" Can Elicit Multi-Turn LLM Reasoning

arXiv 2025

Why Is Spatial Reasoning Hard for VLMs? An Attention Mechanism Perspective on Focus Areas

arXiv 2025

Embodied Agent Interface: Benchmarking LLMs for Embodied Decision Making

arXiv 2024

HourVideo: 1-Hour Video-Language Understanding

arXiv 2024

Chain-of-Action: Faithful and Multimodal Question Answering through Large Language Models

arXiv 2024

IKEA Manuals at Work: 4D Grounding of Assembly Instructions on Internet Videos

arXiv 2024

Visually Descriptive Language Model for Vector Graphics Reasoning

arXiv 2024

MentalArena: Self-play Training of Language Models for Diagnosis and Treatment of Mental Health Disorders

arXiv 2024

Word Embeddings Are Steers for Language Models

arXiv 2023

HallE-Control: Controlling Object Hallucination in Large Multimodal Models

arXiv 2023

Non-Sequential Graph Script Induction via Multimedia Grounding

arXiv 2023

Multimedia Generative Script Learning for Task Planning

arXiv 2022

Affiliations

No known affiliations.

Frequent co-authors

10

from 29 papers