Suchin Gururangan
- Papers: 12
Authored papers
LESS: Selecting Influential Data for Targeted Instruction Tuning
arXiv 2024
Breaking the Curse of Multilinguality with Cross-lingual Expert Language Models
arXiv 2024
Language models scale reliably with over-training and on downstream tasks
arXiv 2024
AboutMe: Using Self-Descriptions in Webpages to Document the Effects of English Pretraining Data Filters
arXiv 2024
SILO Language Models: Isolating Legal Risk In a Nonparametric Datastore
arXiv 2023
Time is Encoded in the Weights of Finetuned Language Models
arXiv 2023
Scaling Expert Language Models with Unsupervised Domain Discovery
arXiv 2023
Editing Models with Task Arithmetic
arXiv 2022
Branch-Train-Merge: Embarrassingly Parallel Training of Expert Language Models
arXiv 2022
M2D2: A Massively Multi-domain Language Modeling Dataset
arXiv 2022
DEMix Layers: Disentangling Domains for Modular Language Modeling
NAACL 2022
Don't Stop Pretraining: Adapt Language Models to Domains and Tasks
Frequent co-authors
10 from 12 papers
Noah A. Smith
Luke Zettlemoyer
professor
Margaret Li
Mike Lewis
Gabriel Ilharco
Hannaneh Hajishirzi
professor
Luca Soldaini
Ludwig Schmidt
professor
Mitchell Wortsman
Tim Althoff