Attribution
Exploiting Inter-Layer Expert Affinity for Accelerating Mixture-of-Experts Model Inference
arXiv 2024
Flover: A Temporal Fusion Framework for Efficient Autoregressive Model Parallel Inference
arXiv 2023
from 2 papers
Aamir Shafi
Dhabaleswar K. Panda
Jinghan Yao
Nawras Alnaasan
Quentin Anthony
Tian Chen