Attribution
SPIQA: A Dataset for Multimodal Question Answering on Scientific Papers
arXiv 2024
UniVTG: Towards Unified Video-Language Temporal Grounding
ICCV 2023
EgoVLPv2: Egocentric Video-Language Pre-training with Fusion in the Backbone
from 3 papers
Kevin Qinghong Lin
Mike Zheng Shou
Pengchuan Zhang
Rama Chellappa
Alex Jinpeng Wang
Difei Gao
Hardik Shah
Joya Chen
Rui Yan
Sayan Nag