Attribution
Taming the Memory Footprint Crisis: System Design for Production Diffusion LLM Serving
arXiv 2025
HiRED: Attention-Guided Token Dropping for Efficient Inference of High-Resolution Vision-Language Models
arXiv 2024
from 2 papers
Bo Ji
Deepu John
Hans Vandierendonck
Jiakun Fan
JinYi Yoon
Kazi Hasan Ibn Arif
Xiangchen Li
Yanglin Zhang