Attribution
MMInference: Accelerating Pre-filling for Long-Context VLMs via Modality-Aware Permutation Sparse Attention
arXiv 2025
MInference 1.0: Accelerating Pre-filling for Long-Context LLMs via Dynamic Sparse Attention
arXiv 2024
from 2 papers
Amir H. Abdi
Chengruidong Zhang
Dongsheng Li
Huiqiang Jiang
Lili Qiu
Qianhui Wu
Xufang Luo
Yucheng Li
Yuqing Yang
Chin-Yew Lin