Attribution
From CLIP to DINO: Visual Encoders Shout in Multi-modal Large Language Models
arXiv 2023
Dongsheng Jiang
Hao Zhang
professor
Hongkai Xiong
Jin Li
Songlin Liu
Xiaopeng Zhang
Yuchen Liu
Zhen Gao