Mike Lewis
- Papers
- 25
Authored papers (25)
FlexOlmo: Open Language Models for Flexible Data Use
arXiv 2025
Mixture-of-Transformers: A Sparse and Scalable Architecture for Multi-Modal Foundation Models
arXiv 2024
Law of the Weakest Link: Cross Capabilities of Large Language Models
arXiv 2024
Byte Latent Transformer: Patches Scale Better Than Tokens
arXiv 2024
Efficient Streaming Language Models with Attention Sinks
arXiv 2023
FActScore: Fine-grained Atomic Evaluation of Factual Precision in Long Form Text Generation
arXiv 2023
Scaling Expert Language Models with Unsupervised Domain Discovery
arXiv 2023
In-context Pretraining: Language Modeling Beyond Document Boundaries
arXiv 2023
LLM.int8(): 8-bit Matrix Multiplication for Transformers at Scale
arXiv 2022
Measuring and Narrowing the Compositionality Gap in Language Models
arXiv 2022
InCoder: A Generative Model for Code Infilling and Synthesis
arXiv 2022
Coder Reviewer Reranking for Code Generation
arXiv 2022
Contrastive Decoding: Open-ended Text Generation as Optimization
arXiv 2022
Rethinking the Role of Demonstrations: What Makes In-Context Learning Work?
arXiv 2022
Nonparametric Masked Language Modeling
arXiv 2022
Improving Passage Retrieval with Zero-Shot Question Generation
arXiv 2022
Branch-Train-Merge: Embarrassingly Parallel Training of Expert Language Models
arXiv 2022
Questions Are All You Need to Train a Dense Passage Retriever
arXiv 2022
Train Short, Test Long: Attention with Linear Biases Enables Input Length Extrapolation
8-bit Optimizers via Block-wise Quantization
MetaICL: Learning to Learn In Context
NAACL 2022
DEMix Layers: Disentangling Domains for Modular Language Modeling
NAACL 2022
Multilingual Denoising Pre-training for Neural Machine Translation
arXiv 2020
Shortformer: Better Language Modeling using Shorter Inputs
ACL 2021
Deal or No Deal? End-to-End Learning for Negotiation Dialogues
arXiv 2017
Frequent co-authors
10 from 25 papers
Luke Zettlemoyer
professor
Noah A. Smith
Sewon Min
Wen-tau Yih
Hannaneh Hajishirzi
professor
Margaret Li
Ari Holtzman
Weijia Shi
researcher
Chunting Zhou
Daniel Fried
professor