Attribution
STEM: Scaling Transformers with Embedding Modules
arXiv 2026
Get More with LESS: Synthesizing Recurrence with KV Cache Compression for Efficient LLM Inference
arXiv 2024
ShadowKV: KV Cache in Shadows for High-Throughput Long-Context LLM Inference
from 3 papers
Beidi Chen
Yuejie Chi
Attiano Purpura-Pontoniere
Changsheng Zhao
Hanshi Sun
Li-Wen Chang
Ningxin Zheng
Ranajoy Sadhukhan
Sheng Cao
Size Zheng