Aoxiong Yin

Attribution

4 papers

Authored papers

Kimi-Audio Technical Report

arXiv 2025

The Best of Both Worlds: Integrating Language Models and Diffusion Models for Video Generation

ICCV 2025

MixSpeech: Cross-Modality Self-Learning with Audio-Visual Stream Mixup for Visual Speech Translation and Recognition

ICCV 2023 1

Distilling Coarse-to-Fine Semantic Matching Knowledge for Weakly Supervised 3D Visual Grounding

ICCV 2023 1

No known affiliations.

Co-authors (from 4 papers)

Kai Shen

Linjun Li

Xinyu Zhou

Xize Cheng

Xu Tan

Yichong Leng

Zehan Wang

Zhou Zhao

Chu Wei

Ding Ding