Harry Dong

3 papers

Authored papers

STEM: Scaling Transformers with Embedding Modules

arXiv 2026

Get More with LESS: Synthesizing Recurrence with KV Cache Compression for Efficient LLM Inference

arXiv 2024

ShadowKV: KV Cache in Shadows for High-Throughput Long-Context LLM Inference

arXiv 2024

No known affiliations.

Co-authors from 3 papers

Beidi Chen

Yuejie Chi

Attiano Purpura-Pontoniere

Changsheng Zhao

Hanshi Sun

Li-Wen Chang

Ningxin Zheng

Ranajoy Sadhukhan

Sheng Cao

Size Zheng