Attribution
GPT4Scene: Understand 3D Scenes from Videos with Vision-Language Models
arXiv 2025
Gemini vs GPT-4V: A Preliminary Comparison and Combination of Vision-Language Models Through Qualitative Cases
arXiv 2023
from 2 papers
Hengshuang Zhao
Jiaqi Wang
Ye Fang
Dahua Lin
Mengchen Zhang
Tong Wu
Zeyi Sun
Zhixiong Zhang
Ziwei Liu