Attribution
VATT: Transformers for Multimodal Self-Supervised Learning from Raw Video, Audio and Text
NeurIPS 2021 12
from 1 paper
Boqing Gong
Liangzhe Yuan
Rui Qian
Shih-Fu Chang
Wei-Hong Chuang
Yin Cui