Attribution
Metadata Conditioning Accelerates Language Model Pre-training (arXiv 2025)
LANISTR: Multimodal Learning from Structured and Unstructured Data (arXiv 2023)
Attention is Not All You Need: Pure Attention Loses Rank Doubly Exponentially with Depth (arXiv 2021)
from 3 papers
Alexander Wettig
researcher
Andreas Loukas
Danqi Chen
professor
Jean-Baptiste Cordonnier
Luxi He
Sadhika Malladi
Sayna Ebrahimi
Sercan O. Arik
Tianyu Gao
Tomas Pfister