Attribution
A$^2$ATS: Retrieval-Based KV Cache Reduction via Windowed Rotary Position Embedding and Query-Aware Vector Quantization
arXiv 2025
Chun Jason Xue
Junhui He
Nan Wang
Peng Zhou
Qiang Liu
Qingan Li
Rui Xu
Shangyu Wu