Yan Zhou
- Papers: 8
Authored papers
Stream-Omni: Simultaneous Multimodal Interactions with Large Language-Vision-Speech Model
arXiv 2025
LLaMA-Omni2: LLM-based Real-time Spoken Chatbot with Autoregressive Streaming Speech Synthesis
arXiv 2025
Can Multimodal Large Language Models Understand Spatial Relations?
arXiv 2025
UnityVideo: Unified Multi-Modal Multi-Task Learning for Enhancing World-Aware Video Generation
arXiv 2025
AgentAlign: Navigating Safety Alignment in the Shift from Informative to Agentic Large Language Models
arXiv 2025
LLaMA-Omni: Seamless Speech Interaction with Large Language Models
arXiv 2024
BayLing 2: A Multilingual Large Language Model with Efficient Language Alignment
arXiv 2024
DASpeech: Directed Acyclic Transformer for Fast and High-quality Speech-to-Speech Translation
Frequent co-authors
10 from 8 papers