Yushuo Guan

Attribution

4 papers

Authored papers

LatentOmni: Rethinking Omni-Modal Understanding via Unified Audio-Visual Latent Reasoning

arXiv 2026

Semantic Routing: Exploring Multi-Layer LLM Feature Weighting for Diffusion Transformers

arXiv 2026

Mavors: Multi-granularity Video Representation for Multimodal Large Language Model

arXiv 2025

VidCapBench: A Comprehensive Benchmark of Video Captioning for Controllable Text-to-Video Generation

arXiv 2025

No known affiliations.

Co-authors from 4 papers

Yuanxing Zhang

Bohan Zeng

Wentao Zhang

Bozhou Li

Di Zhang

Fuzheng Zhang

Jiaheng Liu

Pengfei Wan

Xinlong Chen

Yang Shi