Attribution
SpecReason: Fast and Accurate Inference-Time Compute via Speculative Reasoning
arXiv 2025
Apparate: Rethinking Early Exits to Tame Latency-Throughput Tensions in ML Serving
arXiv 2023
from 2 papers
Ravi Netravali
Rui Pan
Anand Iyer
Gabriele Oliaro
Kai Li
Zhihao Jia
Zhihao Zhang