Attribution
Memorizing Transformers
Self-attention Does Not Need $O(n^2)$ Memory
arXiv 2021
from 2 papers
Charles Staats
Christian Szegedy
DeLesley Hutchins
Yuhuai Wu