Yiyi Zhou
- Papers: 8
Authored papers
What Kind of Visual Tokens Do We Need? Training-free Visual Token Pruning for Multi-modal Large Language Models from the Perspective of Graph
arXiv 2025
Long-VITA: Scaling Large Multi-modal Models to 1 Million Tokens with Leading Short-Context Accuracy
arXiv 2025
Grounded Chain-of-Thought for Multimodal Large Language Models
arXiv 2025
SVFR: A Unified Framework for Generalized Video Face Restoration
CVPR 2025
Feast Your Eyes: Mixture-of-Resolution Adaptation for Multimodal Large Language Models
arXiv 2024
FlashSloth: Lightning Multimodal Large Language Models via Embedded Visual Compression
arXiv 2024
Towards Efficient Diffusion-Based Image Editing with Instant Attention Masks
arXiv 2024
Accelerating Multimodal Large Language Models via Dynamic Visual-Token Exit and the Empirical Findings
arXiv 2024
Frequent co-authors
10 from 8 papers