Attribution
SigLIP 2: Multilingual Vision-Language Encoders with Improved Semantic Understanding, Localization, and Dense Features
arXiv 2025
TIPS: Text-Image Pretraining with Spatial Awareness
arXiv 2024
from 2 papers
Alexey Gritsenko
Andre Araujo
Andreas Steiner
Arjun Karpur
Basil Mustafa
Bingyi Cao
Dan Gnanapragasam
Daniel Salz
Guangxing Han
Howard Zhou