Flash-LLM: Enabling Cost-Effective and Highly-Efficient Large Generative Model Inference with Unstructured Sparsity
arXiv 2023
Donglin Zhuang
Haojun Xia
Shuaiwen Leon Song
Wei Lin
Xiafei Qiu
Yong Li
Yuchao Li
Zhongzhu Zhou