Attribution
MoA: Mixture of Sparse Attention for Automatic Large Language Model Compression
arXiv 2024
Genghan Zhang
Guohao Dai
Haofeng Huang
Hongyi Wang
Huazhong Yang
Shengen Yan
Shiyao Li
Tianqi Wu
Tianyu Fu
Xuefei Ning