Attribution
SilVar: Speech Driven Multimodal Model for Reasoning Visual Question Answering and Object Localization
arXiv 2024
Chris Ngo
Phu-Vinh Nguyen
Tan-Hanh Pham
Truong-Son Hy