Attribution
Pixel-SAIL: Single Transformer For Pixel-Grounded Understanding
arXiv 2025
DenseWorld-1M: Towards Detailed Dense Grounded Caption in the Real World
Are They the Same? Exploring Visual Correspondence Shortcomings of Multimodal LLMs
ICCV 2025
from 3 papers
Tao Zhang
Xiangtai Li
Jiashi Feng
Lu Qi
Shilin Xu
Shunping Ji
Yanwei Li
Yikang Zhou
Zilong Huang
Guang Shi