Haoyuan Li
- Papers: 8
Authored papers
Matrix-3D: Omnidirectional Explorable 3D World Generation
arXiv 2025
HealthGPT: A Medical Large Vision-Language Model for Unifying Comprehension and Generation via Heterogeneous Knowledge Adaptation
arXiv 2025
Streaming Video Question-Answering with In-context Video KV-Cache Retrieval
arXiv 2025
Fast-Slow Thinking for Large Vision-Language Model Reasoning
arXiv 2025
LLaVA-MoD: Making LLaVA Tiny via MoE Knowledge Distillation
arXiv 2024
EMOVA: Empowering Language Models to See, Hear and Speak with Vivid Emotions
CVPR 2025
HyperLLaVA: Dynamic Visual and Language Expert Tuning for Multimodal Large Language Models
arXiv 2024
Align$^2$LLaVA: Cascaded Human and Large Language Model Preference Alignment for Multi-modal Instruction Curation
arXiv 2024
Frequent co-authors
10 from 8 papers