Gangyan Zeng

Attribution

4 papers

Authored papers

When Semantics Mislead Vision: Mitigating Large Multimodal Models Hallucinations in Scene Text Spotting and Understanding

arXiv 2025

VidText: Towards Comprehensive Evaluation for Video Text Understanding

arXiv 2025

Track the Answer: Extending TextVQA from Image to Video with Spatio-Temporal Clues

arXiv 2024

Focus, Distinguish, and Prompt: Unleashing CLIP for Efficient and Flexible Scene Text Retrieval

arXiv 2024

No known affiliations.

Co-authors (from 4 papers)

Yu Zhou

Yan Zhang

Nicu Sebe

Yan Shu

Can Ma

Daiqing Wu

Dongbao Yang

Hangui Lin

Harry Yang

Huawen Shen