Attribution
BitMoD: Bit-serial Mixture-of-Datatype LLM Acceleration
arXiv 2024
LQER: Low-Rank Quantization Error Reconstruction for LLMs
Revisiting Block-based Quantisation: What is Important for Sub-8-bit LLM Inference?
arXiv 2023
from 3 papers
Cheng Zhang
Jianyi Cheng
Yiren Zhao
Ahmed F AbouElhamayed
Ilia Shumailov
Marta Andronic
Mohamed S. Abdelfattah
Xilai Dai
Yang Wang
Yuzong Chen