Avner May
4 papers
Authored papers
Sequoia: Scalable, Robust, and Hardware-aware Speculative Decoding
arXiv 2024
The Mamba in the Llama: Distilling and Accelerating Hybrid Models
arXiv 2024
MagicDec: Breaking the Latency-Throughput Tradeoff for Long Context Generation with Speculative Decoding
arXiv 2024
SpecExec: Massively Parallel Speculative Decoding for Interactive LLM Inference on Consumer Devices
arXiv 2024
Affiliations
No known affiliations.
Frequent co-authors
10 from 4 papers