Shaoduo Gan

1 paper

Authored papers

SqueezeAttention: 2D Management of KV-Cache in LLM Inference via Layer-wise Optimal Budget

arXiv 2024

No known affiliations.

Co-authors (from 1 paper)

Bin Cui

ZiHao Wang