Alexey Tumanov
4 papers
Authored papers
Taming Throughput-Latency Tradeoff in LLM Inference with Sarathi-Serve
arXiv 2024
Etalon: Holistic Performance Evaluation Framework for LLM Inference Systems
arXiv 2024
Initialization using Update Approximation is a Silver Bullet for Extremely Efficient Low-Rank Fine-Tuning
arXiv 2024
Ray: A Distributed Framework for Emerging AI Applications
arXiv 2017
Affiliations
No known affiliations.
Frequent co-authors
10 from 4 papers