Attribution
video-SALMONN 2: Captioning-Enhanced Audio-Visual Large Language Models
arXiv 2025
ACVUBench: Audio-Centric Video Understanding Benchmark
video-SALMONN-o1: Reasoning-enhanced Audio-visual Large Language Model
from 3 papers
Changli Tang
Chao Zhang
Guangzhi Sun
Wei Li
Yixuan Li
Yudong Yang
Zejun Ma
Peihan Li
Yifan Jiang