Attribution
Divide-or-Conquer? Which Part Should You Distill Your LLM?
arXiv 2024
Recurrent Drafter for Fast Speculative Decoding in Large Language Models
from 2 papers
Chong Wang
He Bai
Jiatao Gu
Navdeep Jaitly
researcher
VG Vinod Vydiswaran
Xuanyu Zhang
Yi Wang
Yizhe Zhang
Yunfei Cheng
Zhuofeng Wu