Nathan Lambert
Researcher at Ai2 leading post-training (Tulu, OLMo) and author of Interconnects, the most-read newsletter on RLHF / post-training and evals.
- Role: researcher
- Currently at: Allen Institute for AI (Ai2)
- Twitter: twitter.com/natolambert
- GitHub: github.com/natolambert
- Scholar: scholar.google.com/citations
- Papers: 17
Authored papers (17)
Meta-Reinforcement Learning with Self-Reflection for Agentic Search
arXiv 2026
RewardBench 2: Advancing Reward Model Evaluation
preprint
Reinforcement Learning from Human Feedback
arXiv 2025
Olmo 3
arXiv 2025
Spurious Rewards: Rethinking Training Signals in RLVR
arXiv 2025
Tulu 3: Pushing Frontiers in Open Language Model Post-Training
preprint
2 OLMo 2 Furious
arXiv 2024
OLMo: Accelerating the Science of Language Models
arXiv 2024
Molmo and PixMo: Open Weights and Open Data for State-of-the-Art Vision-Language Models
CVPR 2025
Dolma: an Open Corpus of Three Trillion Tokens for Language Model Pretraining Research
arXiv 2024
OLMoE: Open Mixture-of-Experts Language Models
arXiv 2024
RewardBench: Evaluating Reward Models for Language Modeling
arXiv 2024
WildGuard: Open One-Stop Moderation Tools for Safety Risks, Jailbreaks, and Refusals of LLMs
arXiv 2024
M-RewardBench: Evaluating Reward Models in Multilingual Settings
arXiv 2024
A Survey on Data Selection for Language Models
arXiv 2024
D2PO: Discriminator-Guided DPO with Response Evaluation Models
arXiv 2024
Reward Reports for Reinforcement Learning
arXiv 2022
Eval contributions (2)
Tool contributions (1)
Affiliations
Frequent co-authors (10, from 17 papers)
Hannaneh Hajishirzi (professor)
Noah A. Smith
Jacob Morrison (research-engineer)
Luca Soldaini
Dirk Groeneveld
Hamish Ivison (grad-student)
Kyle Lo
Pete Walsh
Valentina Pyatkin (research-scientist)
Akshita Bhagia