Lequn Chen

Attribution

3 papers

Authored papers

FlashInfer: Efficient and Customizable Attention Engine for LLM Inference Serving

arXiv 2025

Punica: Multi-Tenant LoRA Serving

arXiv 2023

Atom: Low-bit Quantization for Efficient and Accurate LLM Serving

arXiv 2023

No known affiliations.

Co-authors from 3 papers

Arvind Krishnamurthy

Luis Ceze

Zihao Ye

Baris Kasikci

Tianqi Chen

Chien-Yu Lin

Danyang Zhuo

Kan Zhu

Ruihang Lai

Size Zheng