Yonghui Wang

4 papers

Authored papers

ROOT: VLM based System for Indoor Scene Understanding and Beyond

arXiv 2024

TextCoT: Zoom In for Enhanced Multimodal Text-Rich Image Understanding

arXiv 2024

AdaptVision: Dynamic Input Scaling in MLLMs for Versatile Scene Understanding

arXiv 2024

Towards Improving Document Understanding: An Exploration on Text-Grounding via MLLMs

arXiv 2023

No known affiliations.

Co-authors (from 4 papers)

Houqiang Li

Wengang Zhou

Hao Feng

Bozhi Luan

Haoran Li

Hong Chen

Keyi Zhou

Shi-Yong Chen

Siyi Li

Zhenxing Zhou