Adam Roberts
- Papers: 14
Authored papers (14)
The Flan Collection: Designing Data and Methods for Effective Instruction Tuning
arXiv 2023
UniMax: Fairer and more Effective Language Sampling for Large-Scale Multilingual Pretraining
arXiv 2023
Scaling Up Models and Data with t5x and seqio
arXiv 2022
VeLO: Training Versatile Learned Optimizers by Scaling Up
arXiv 2022
What Language Model Architecture and Pretraining Objective Work Best for Zero-Shot Generalization?
arXiv 2022
Large Language Models Struggle to Learn Long-Tail Knowledge
arXiv 2022
Crosslingual Generalization through Multitask Finetuning
arXiv 2022
Multi-instrument Music Synthesis with Spectrogram Diffusion
arXiv 2022
Do Transformer Modifications Transfer Across Implementations and Applications?
EMNLP 2021 11
ByT5: Towards a token-free future with pre-trained byte-to-byte models
arXiv 2021
How Much Knowledge Can You Pack Into the Parameters of a Language Model?
EMNLP 2020 11
mT5: A massively multilingual pre-trained text-to-text transformer
NAACL 2021 4
DDSP: Differentiable Digital Signal Processing
ICLR 2020 1
Extracting Training Data from Large Language Models
arXiv 2020
Affiliations
Frequent co-authors
10 from 14 papers
Colin Raffel
Hyung Won Chung
researcher
Sharan Narang
Noah Constant
Noam Shazeer
VP / co-lead Gemini
Yi Tay
founder
Aditya Barua
Albert Webson
Curtis Hawthorne
Eric Wallace