William Brandon
- Papers: 4
Authored papers
Ladder-residual: parallelism-aware architecture for accelerating large model inference with communication overlapping
arXiv 2025
Fast Matrix Multiplications for Lookup Table-Quantized LLMs
arXiv 2024
Hydra: Sequentially-Dependent Draft Heads for Medusa Decoding
arXiv 2024
Striped Attention: Faster Ring Attention for Causal Transformers
arXiv 2023
Affiliations
No known affiliations.
Frequent co-authors
10 from 4 papers