Attribution
On Path to Multimodal Generalist: General-Level and General-Bench
arXiv 2025
SwimVG: Step-wise Multimodal Fusion and Adaption for Visual Grounding
Multi-Stage Vision Token Dropping: Towards Efficient Multimodal Large Language Model
arXiv 2024
from 3 papers
Quanjun Yin
Richang Hong
Ting Liu
Yue Hu
Bobo Li
Daoan Zhang
Hanwang Zhang
Hao Fei
Haobo Yuan
Jiahao Meng