Attribution
Scale Efficiently: Insights from Pre-training and Fine-tuning Transformers
arXiv 2021
Quantifying Attention Flow in Transformers
Long Range Arena: A Benchmark for Efficient Transformers
arXiv 2020
from 3 papers
Donald Metzler
Jinfeng Rao
Mostafa Dehghani
Yi Tay
founder
Ashish Vaswani
Dani Yogatama
Dara Bahri
Hyung Won Chung
researcher
Liu Yang
Philip Pham