Hannaneh Hajishirzi
Professor of computer science at the University of Washington and Senior Director of AI at the Allen Institute for AI; co-leads OLMo and Tülu, the fully open language-model program.
- Role: professor
- Currently at: University of Washington
- Twitter: twitter.com/HannaHajishirzi
- GitHub: github.com/hannaneh
- Scholar: scholar.google.com/citations
- Papers: 68
Authored papers (68)
Meta-Reinforcement Learning with Self-Reflection for Agentic Search
arXiv 2026
Learning to Detect Language Model Training Data via Active Reconstruction
arXiv 2026
s1: Simple Test-Time Scaling
preprint
Olmo 3
arXiv 2025
OLMoTrace: Tracing Language Model Outputs Back to Trillions of Training Tokens
arXiv 2025
Infini-gram mini: Exact n-gram Search at the Internet Scale with FM-Index
arXiv 2025
DR Tulu: Reinforcement Learning with Evolving Rubrics for Deep Research
arXiv 2025
RLVE: Scaling Up Reinforcement Learning for Language Models with Adaptive Verifiable Environments
arXiv 2025
Spurious Rewards: Rethinking Training Signals in RLVR
arXiv 2025
FlexOlmo: Open Language Models for Flexible Data Use
arXiv 2025
EvalTree: Profiling Language Model Weaknesses via Hierarchical Capability Trees
arXiv 2025
SciArena: An Open Evaluation Platform for Foundation Models in Scientific Literature Tasks
arXiv 2025
Tulu 3: Pushing Frontiers in Open Language Model Post-Training
preprint
2 OLMo 2 Furious
arXiv 2024
OLMo: Accelerating the Science of Language Models
arXiv 2024
Molmo and PixMo: Open Weights and Open Data for State-of-the-Art Vision-Language Models
CVPR 2025 1
Dolma: an Open Corpus of Three Trillion Tokens for Language Model Pretraining Research
arXiv 2024
OLMoE: Open Mixture-of-Experts Language Models
arXiv 2024
RewardBench: Evaluating Reward Models for Language Modeling
arXiv 2024
Data Engineering for Scaling Language Models to 128K Context
arXiv 2024
OpenScholar: Synthesizing Scientific Literature with Retrieval-augmented LMs
arXiv 2024
SciRIFF: A Resource to Enhance Language Model Instruction-Following over Scientific Literature
arXiv 2024
Infini-gram: Scaling Unbounded n-gram Language Models to a Trillion Tokens
arXiv 2024
Do Membership Inference Attacks Work on Large Language Models?
arXiv 2024
Husky: A Unified, Open-Source Language Agent for Multi-Step Reasoning
arXiv 2024
Getting it Right: Improving Spatial Consistency in Text-to-Image Models
arXiv 2024
APT: Adaptive Pruning and Tuning Pretrained Language Models for Efficient Training and Inference
arXiv 2024
Hybrid Preferences: Learning to Route Instances for Human vs. AI Feedback
arXiv 2024
How Many Van Goghs Does It Take to Van Gogh? Finding the Imitation Threshold
arXiv 2024
HREF: Human Response-Guided Evaluation of Instruction Following in Language Models
arXiv 2024
Self-RAG: Learning to Retrieve, Generate, and Critique through Self-Reflection
arXiv 2023
DataComp: In search of the next generation of multimodal datasets
NeurIPS 2023 11
Personalized Soups: Personalized Large Language Model Alignment via Post-hoc Parameter Merging
arXiv 2023
FActScore: Fine-grained Atomic Evaluation of Factual Precision in Long Form Text Generation
arXiv 2023
BTR: Binary Token Representations for Efficient Retrieval Augmented Language Models
arXiv 2023
Crystal: Introspective Reasoners Reinforced with Self-Feedback
arXiv 2023
TaskWeb: Selecting Better Source Tasks for Multi-task NLP
arXiv 2023
SILO Language Models: Isolating Legal Risk In a Nonparametric Datastore
arXiv 2023
Vera: A General-Purpose Plausibility Estimation Model for Commonsense Statements
arXiv 2023
Beyond the Imitation Game: Quantifying and extrapolating the capabilities of language models
TMLR
Is Reinforcement Learning (Not) for Natural Language Processing: Benchmarks, Baselines, and Building Blocks for Natural Language Policy Optimization
arXiv 2022
Editing Models with Task Arithmetic
arXiv 2022
Super-NaturalInstructions: Generalization via Declarative Instructions on 1600+ NLP Tasks
arXiv 2022
Rethinking the Role of Demonstrations: What Makes In-Context Learning Work?
arXiv 2022
NaturalProver: Grounded Mathematical Proof Generation with Language Models
arXiv 2022
Rainier: Reinforced Knowledge Introspector for Commonsense Question Answering
arXiv 2022
Self-Instruct: Aligning Language Models with Self-Generated Instructions
arXiv 2022
When Not to Trust Language Models: Investigating Effectiveness of Parametric and Non-Parametric Memories
arXiv 2022
Nonparametric Masked Language Modeling
arXiv 2022
Task-aware Retrieval with Instructions
arXiv 2022
CREPE: Open-Domain Question Answering with False Presuppositions
arXiv 2022
MetaICL: Learning to Learn In Context
NAACL 2022 7
Generated Knowledge Prompting for Commonsense Reasoning
ACL 2022 5
MultiVerS: Improving scientific claim verification with weak supervision and full-document context
Findings (NAACL) 2022 7
Probing Across Time: What Does RoBERTa Know and When?
Findings (EMNLP) 2021 11
Prompt Waywardness: The Curious Case of Discretized Interpretation of Continuous Prompts
NAACL 2022 7
Robust fine-tuning of zero-shot models
Efficient Passage Retrieval with Hashing for Open-domain Question Answering
ACL 2021 5
GooAQ: Open Question Answering with Diverse Answer Types
Findings (EMNLP) 2021 11
NaturalProofs: Mathematical Theorem Proving in Natural Language
arXiv 2021
Dataset Cartography: Mapping and Diagnosing Datasets with Training Dynamics
EMNLP 2020 11
MedICaT: A Dataset of Medical Images, Captions, and Textual References
Findings of the Association for Computational Linguistics 2020
UnifiedQA: Crossing Format Boundaries With a Single QA System
Findings of the Association for Computational Linguistics 2020
DeLighT: Deep and Light-weight Transformer
XOR QA: Cross-lingual Open-Retrieval Question Answering
NAACL 2021 4
Contextualized Sparse Representations for Real-Time Open-Domain Question Answering
Real-Time Open-Domain Question Answering with Dense-Sparse Phrase Index
A Diagram Is Worth A Dozen Images
arXiv 2016
Tool contributions (1)
Affiliations
Previously
Frequent co-authors (10, from 68 papers)
Luke Zettlemoyer (professor)
Noah A. Smith
Sewon Min
Yejin Choi (professor)
Luca Soldaini
Pang Wei Koh
Yizhong Wang (researcher)
Ali Farhadi (CEO)
Jiacheng Liu
Kyle Lo