Attribution
Mixture-of-Transformers: A Sparse and Scalable Architecture for Multi-Modal Foundation Models
arXiv 2024
Memory Layers at Scale
Byte Latent Transformer: Patches Scale Better Than Tokens
CiT: Curation in Training for Effective Vision-Language Data
ICCV 2023 1
from 4 papers
Luke Zettlemoyer
professor
Chunting Zhou
Lili Yu
Mike Lewis
Srinivasan Iyer
Wen-tau Yih
Ari Holtzman
Artidoro Pagnoni
Barlas Oğuz
Benjamin Muller