Attribution
DeepSpeed-FastGen: High-throughput Text Generation for LLMs via MII and DeepSpeed-Inference
arXiv 2024
Ammar Ahmad Awan
Arash Bakhtiari
Connor Holmes
Heyang Qin
Jeff Rasley
Lev Kurilenko
Masahiro Tanaka
Michael Wyatt
Samyam Rajbhandari
Yuxiong He