Noah A. Smith
- Papers: 60
Authored papers (60)
Meta-Reinforcement Learning with Self-Reflection for Agentic Search
arXiv 2026
Olmo 3
arXiv 2025
OLMoTrace: Tracing Language Model Outputs Back to Trillions of Training Tokens
arXiv 2025
Infini-gram mini: Exact n-gram Search at the Internet Scale with FM-Index
arXiv 2025
PointArena: Probing Multimodal Grounding Through Language-Guided Pointing
arXiv 2025
Bolmo: Byteifying the Next Generation of Language Models
arXiv 2025
FlexOlmo: Open Language Models for Flexible Data Use
arXiv 2025
MMMG: a Comprehensive and Reliable Evaluation Suite for Multitask Multimodal Generation
arXiv 2025
BLAB: Brutally Long Audio Bench
arXiv 2025
2 OLMo 2 Furious
arXiv 2024
Tulu 3: Pushing Frontiers in Open Language Model Post-Training
preprint
OLMo: Accelerating the Science of Language Models
arXiv 2024
Molmo and PixMo: Open Weights and Open Data for State-of-the-Art Vision-Language Models
CVPR 2025 1
Dolma: an Open Corpus of Three Trillion Tokens for Language Model Pretraining Research
arXiv 2024
OLMoE: Open Mixture-of-Experts Language Models
arXiv 2024
RewardBench: Evaluating Reward Models for Language Modeling
arXiv 2024
Tuning Language Models by Proxy
arXiv 2024
Hybrid Preferences: Learning to Route Instances for Human vs. AI Feedback
arXiv 2024
Data Mixture Inference: What do BPE Tokenizers Reveal about their Training Data?
arXiv 2024
Breaking the Curse of Multilinguality with Cross-lingual Expert Language Models
arXiv 2024
What's In My Big Data?
arXiv 2023
TIFA: Accurate and Interpretable Text-to-Image Faithfulness Evaluation with Question Answering
ICCV 2023 1
Vera: A General-Purpose Plausibility Estimation Model for Commonsense Statements
arXiv 2023
Scaling Expert Language Models with Unsupervised Domain Discovery
arXiv 2023
SILO Language Models: Isolating Legal Risk In a Nonparametric Datastore
arXiv 2023
Time is Encoded in the Weights of Finetuned Language Models
arXiv 2023
We're Afraid Language Models Aren't Modeling Ambiguity
arXiv 2023
In-context Pretraining: Language Modeling Beyond Document Boundaries
arXiv 2023
How Language Model Hallucinations Can Snowball
arXiv 2023
Summarization-Based Document IDs for Generative Retrieval with Language Models
arXiv 2023
Super-NaturalInstructions: Generalization via Declarative Instructions on 1600+ NLP Tasks
arXiv 2022
Self-Instruct: Aligning Language Models with Self-Generated Instructions
arXiv 2022
One Embedder, Any Task: Instruction-Finetuned Text Embeddings
arXiv 2022
UnifiedSKG: Unifying and Multi-Tasking Structured Knowledge Grounding with Text-to-Text Language Models
arXiv 2022
Measuring and Narrowing the Compositionality Gap in Language Models
arXiv 2022
Modeling Context With Linear Attention for Scalable Document-Level Translation
arXiv 2022
PromptCap: Prompt-Guided Task-Aware Image Captioning
arXiv 2022
Selective Annotation Makes Language Models Better Few-Shot Learners
arXiv 2022
WANLI: Worker and AI Collaboration for Natural Language Inference Dataset Creation
arXiv 2022
RealTime QA: What's the Answer Right Now?
Branch-Train-Merge: Embarrassingly Parallel Training of Expert Language Models
arXiv 2022
A Call for Clarity in Beam Search: How It Works and When It Stops
arXiv 2022
In-Context Learning for Few-Shot Dialogue State Tracking
arXiv 2022
Transparency Helps Reveal When Language Models Learn Meaning
arXiv 2022
Binding Language Models in Symbolic Languages
arXiv 2022
Train Short, Test Long: Attention with Linear Biases Enables Input Length Extrapolation
DExperts: Decoding-Time Controlled Text Generation with Experts and Anti-Experts
ACL 2021 5
A Dataset of Information-Seeking Questions and Answers Anchored in Research Papers
NAACL 2021 4
DEMix Layers: Disentangling Domains for Modular Language Modeling
NAACL 2022 7
NeuroLogic A*esque Decoding: Constrained Text Generation with Lookahead Heuristics
NAACL 2022 7
Probing Across Time: What Does RoBERTa Know and When?
Findings (EMNLP) 2021 11
Challenges in Automated Debiasing for Toxic Language Detection
EACL 2021 2
Dataset Cartography: Mapping and Diagnosing Datasets with Training Dynamics
EMNLP 2020 11
Don't Stop Pretraining: Adapt Language Models to Domains and Tasks
Shortformer: Better Language Modeling using Shorter Inputs
ACL 2021 5
Deep Encoder, Shallow Decoder: Reevaluating Non-autoregressive Machine Translation
Knowledge Enhanced Contextual Word Representations
Dynamic Entity Representations in Neural Language Models
Transition-Based Dependency Parsing with Stack Long Short-Term Memory
Retrofitting Word Vectors to Semantic Lexicons
Frequent co-authors
10 from 60 papers
Hannaneh Hajishirzi
professor
Luke Zettlemoyer
professor
Yejin Choi
professor
Luca Soldaini
Yizhong Wang
researcher
Alisa Liu
researcher
Dirk Groeneveld
Jungo Kasai
Kyle Lo
Nathan Lambert
researcher