Attribution
OLMo: Accelerating the Science of Language Models
arXiv 2024
Dolma: an Open Corpus of Three Trillion Tokens for Language Model Pretraining Research
Scalable Data Ablation Approximations for Language Models through Modular Training and Merging
from 3 papers
Emma Strubell
Ian Magnusson
Jesse Dodge
Aakanksha Naik
Abhilasha Ravichander
Akshita Bhagia
Crystal Nam
David Atkinson
Dirk Groeneveld
Dustin Schwenk