Attribution
Jakiro: Boosting Speculative Decoding with Decoupled Multi-Head via MoE
arXiv 2025
Gumiho: A Hybrid Architecture to Prioritize Early Tokens in Speculative Decoding
Nearly Lossless Adaptive Bit Switching
from 3 papers
Dong Li
Emad Barsoum
Jinze Li
Pengju Ren
Xuanwu Yin
Yixing Xu
Zhenhua Liu
Edith C. H. Ngai
Fuwei Yang
Tian Xia