Attribution
OScaR: The Occam's Razor for Extreme KV Cache Quantization in LLMs and Beyond
arXiv 2026
Attention Sink in Transformers: A Survey on Utilization, Interpretation, and Mitigation
from 2 papers
Chao Zhang
Jing Xiong
Ngai Wong
Rui Yang
Wei Wu
Yifan Zhang
Yuchen Xie
Yulei Qian
Zunhai Su
Chaofan Tao