Attribution
Revisiting Text-to-Image Evaluation with Gecko: On Metrics, Prompts, and Human Ratings
arXiv 2024
Vid2Seq: Large-Scale Pretraining of a Visual Language Model for Dense Video Captioning
CVPR 2023 1
Connecting Vision and Language with Localized Narratives
ECCV 2020 8
from 3 papers
Aida Nematzadeh
Antoine Miech
Antoine Yang
Arsha Nagrani
Chris Knutsen
Chuhan Zhang
Cordelia Schmid
Cyrus Rashtchian
Emanuele Bugliarello
Isabela Albuquerque