Yuejie Chi

Attribution

2 papers

Authored papers

Get More with LESS: Synthesizing Recurrence with KV Cache Compression for Efficient LLM Inference

arXiv 2024

ShadowKV: KV Cache in Shadows for High-Throughput Long-Context LLM Inference

arXiv 2024

No known affiliations.

Co-authors (from 2 papers)

Beidi Chen

Harry Dong

Hanshi Sun

Li-Wen Chang

Ningxin Zheng

Size Zheng

Wenlei Bao

Xin Liu

Xinyu Yang

Zhangyang Wang