Attribution
SpeechCLIP: Integrating Speech with Pre-Trained Vision and Language Model
arXiv 2022
Why is Winoground Hard? Investigating Failures in Visuolinguistic Compositionality
from 2 papers
David Harwath
Anuj Diwan
Eunsol Choi
Heng-Jui Chang
Hsuan-Fu Wang
Hung-Yi Lee
Kyle Mahowald
Yi-Jen Shih