Attribution
Jakiro: Boosting Speculative Decoding with Decoupled Multi-Head via MoE
arXiv 2025
Gumiho: A Hybrid Architecture to Prioritize Early Tokens in Speculative Decoding
from 2 papers
Dong Li
Emad Barsoum
Haiduo Huang
Jinze Li
Xuanwu Yin
Edith C. H. Ngai
Fuwei Yang
Pengju Ren
Yang Liu
Zhenhua Liu