Attribution
SVDQuant: Absorbing Outliers by Low-Rank Components for 4-Bit Diffusion Models
arXiv 2024
QServe: W4A8KV4 Quantization and System Co-design for Efficient LLM Serving
Once-for-All: Train One Network and Specialize it for Efficient Deployment
arXiv 2019
from 3 papers
Song Han
Chuang Gan
Yujun Lin
Chenlin Meng
Enze Xie
Guangxuan Xiao
Han Cai
Haotian Tang
Jun-Yan Zhu
Junxian Guo