Heting Gao

Attribution

4 papers

Authored papers

Omni-Diffusion: Unified Multimodal Understanding and Generation with Masked Discrete Diffusion

arXiv 2026

VITA-Audio: Fast Interleaved Cross-Modal Token Generation for Efficient Large Speech-Language Model

arXiv 2025

VITA-1.5: Towards GPT-4o Level Real-Time Vision and Speech Interaction

arXiv 2025

ContentVec: An Improved Self-Supervised Speech Representation by Disentangling Speakers

arXiv 2022

No known affiliations.

from 4 papers

Chaoyou Fu

Haoyu Cao

Xing Sun

Yunhang Shen

Zuwei Long

Caifeng Shan

Ke Li

Lijiang Li

Ran He

Rongrong Ji