Attribution
KVQuant: Towards 10 Million Context Length LLM Inference with KV Cache Quantization
arXiv 2024
Amir Gholami
Coleman Hooper
Hiva Mohammadzadeh
Kurt Keutzer
Michael W. Mahoney
Sehoon Kim