Wentong Li

Papers: 10

Attribution

Affiliations & profile: Semantic Scholar

Authored papers

InstructSAM: Segment Any Instance with Any Instructions

arXiv 2026

2026

VisionTrim: Unified Vision Token Compression for Training-Free MLLM Acceleration

arXiv 2026

2026

Inst3D-LMM: Instance-Aware 3D Scene Understanding with Multi-modal Instruction Tuning

CVPR 2025 1

2025

Forging Spatial Intelligence: A Roadmap of Multi-Modal Data Pre-Training for Autonomous Systems

arXiv 2025

2025

EOC-Bench: Can MLLMs Identify, Recall, and Forecast Objects in an Egocentric World?

arXiv 2025

2025

Uncertainty-Instructed Structure Injection for Generalizable HD Map Construction

CVPR 2025 1

2025

TokenPacker: Efficient Visual Projector for Multimodal LLM

arXiv 2024

2024

VideoRefer Suite: Advancing Spatial-Temporal Object Understanding with Video LLM

CVPR 2025 1

2024

Osprey: Pixel Understanding with Visual Instruction Tuning

CVPR 2024 1

2023

H2RBox: Horizontal Box Annotation is All You Need for Oriented Object Detection

arXiv 2022

2022

Affiliations

No known affiliations.

Frequent co-authors

from 10 papers

Jianke Zhu

Song Wang

Yuqian Yuan

Junbo Chen

Wenqiao Zhang

Yueting Zhuang

Deli Zhao

Dongqi Tang

Hanxun Yu

Jian Liu