Attribution
Tree Attention: Topology-aware Decoding for Long-Context Attention on GPU clusters
arXiv 2024
Beren Millidge
Jonathan Pilault
Quentin Anthony
Vasudev Shyam