Attribution
LiteFrame: Efficient Vision Encoders Unlock Frame Scaling in Video LLMs
arXiv 2026
SigLIP 2: Multilingual Vision-Language Encoders with Improved Semantic Understanding, Localization, and Dense Features
arXiv 2025
from 2 papers
Alexey Gritsenko
Andreas Steiner
Basil Mustafa
Bohyung Han
Boqing Gong
Danfeng Qin
Deqing Sun
Ibrahim Alabdulmohsin
Jeremiah Harmsen
JiHwan Kim