Attribution
From CLIP to DINO: Visual Encoders Shout in Multi-modal Large Language Models
arXiv 2023
from 1 paper
Dongsheng Jiang
Hao Zhang
professor
Hongkai Xiong
Jin Li
Jin'e Zhao
Xiaopeng Zhang
Yuchen Liu
Zhen Gao