Attribution
Rethinking Channel Dimensions to Isolate Outliers for Low-bit Weight Quantization of Large Language Models (arXiv 2023)
LUT-GEMM: Quantized Matrix Multiplication based on LUTs for Efficient Inference in Large-Scale Generative Language Models (arXiv 2022)
from 2 papers
Byeongwook Kim
Dongsoo Lee
Jeonghoon Kim
Se Jung Kwon
Baeseong Park
Gunho Park
Jung Hwan Heo
Minsub Kim
Sungjae Lee
Youngjoo Lee