Mingwei Zhu

4 papers

Authored papers

ZoomEye: Enhancing Multimodal LLMs with Human-Like Zooming Capabilities through Tree-Based Image Exploration

arXiv 2024

GroundVLP: Harnessing Zero-shot Visual Grounding from Vision-Language Pre-training and Open-Vocabulary Object Detection

arXiv 2023

Benchmarking Sequential Visual Input Reasoning and Prediction in Multimodal Large Language Models

arXiv 2023

VL-CheckList: Evaluating Pre-trained Vision-Language Models with Objects, Attributes and Relations

arXiv 2022

No known affiliations.

Co-authors (from 4 papers)

Jianwei Yin

Tiancheng Zhao

Haozhan Shen

Kangjia Zhao

Kyusong Lee

Leigang Sha

Ruochen Xu

Tianqi Zhang

Xiaopeng Lu

Yu Shu