Attribution
HumanOmni: A Large Vision-Speech Language Model for Human-Centric Video Understanding
arXiv 2025
Boyuan Sun
Detao Bai
Jiaxing Zhao
Liefeng Bo
Qize Yang
Shenghao Fu
Shimin Yao
Weixuan Chen
Xiang Chen
Xihan Wei