Attribution
SigLIP 2: Multilingual Vision-Language Encoders with Improved Semantic Understanding, Localization, and Dense Features
arXiv 2025
Simple Open-Vocabulary Object Detection with Vision Transformers
arXiv 2022
from 2 papers
Xiao Wang
Xiaohua Zhai
Alexey Dosovitskiy
Andreas Steiner
Anurag Arnab
Aravindh Mahendran
Austin Stone
Basil Mustafa
Dirk Weissenborn
Ibrahim Alabdulmohsin