Attribution
HumanOmni: A Large Vision-Speech Language Model for Human-Centric Video Understanding
arXiv 2025
HumanOmniV2: From Understanding to Omni-Modal Reasoning with Context
from 2 papers
Boyuan Sun
Detao Bai
Jiaxing Zhao
Qize Yang
Shenghao Fu
Shimin Yao
Xihan Wei
Bowen Yin
Jingren Zhou
Liefeng Bo