Wen-tau Yih
Authored papers (29)
Anchored Decoding: Provably Reducing Copyright Risk for Any Language Model
arXiv 2026
ReasonIR: Training Retrievers for Reasoning Tasks
arXiv 2025
SelfCite: Self-Supervised Alignment for Context Attribution in Large Language Models
arXiv 2025
Data-Efficient Pretraining with Group-Level Data Influence Modeling
arXiv 2025
DR Tulu: Reinforcement Learning with Evolving Rubrics for Deep Research
arXiv 2025
Meta CLIP 2: A Worldwide Scaling Recipe
arXiv 2025
FlexOlmo: Open Language Models for Flexible Data Use
arXiv 2025
DRAMA: Diverse Augmentation from Large Language Models to Smaller Dense Retrievers
arXiv 2025
Mixture-of-Transformers: A Sparse and Scalable Architecture for Multi-Modal Foundation Models
arXiv 2024
OpenScholar: Synthesizing Scientific Literature with Retrieval-augmented LMs
arXiv 2024
CRAG -- Comprehensive RAG Benchmark
arXiv 2024
Memory Layers at Scale
arXiv 2024
Instruction-tuned Language Models are Better Knowledge Learners
arXiv 2024
Nearest Neighbor Speculative Decoding for LLM Generation and Attribution
arXiv 2024
FActScore: Fine-grained Atomic Evaluation of Factual Precision in Long Form Text Generation
arXiv 2023
LEVER: Learning to Verify Language-to-Code Generation with Execution
arXiv 2023
Expand, Rerank, and Retrieve: Query Reranking for Open-Domain Question Answering
arXiv 2023
Autoregressive Search Engines: Generating Substrings as Document Identifiers
arXiv 2022
One Embedder, Any Task: Instruction-Finetuned Text Embeddings
arXiv 2022
InCoder: A Generative Model for Code Infilling and Synthesis
arXiv 2022
DiffCSE: Difference-based Contrastive Learning for Sentence Embeddings
NAACL 2022 7
Nonparametric Masked Language Modeling
arXiv 2022
Task-aware Retrieval with Instructions
arXiv 2022
Improving Passage Retrieval with Zero-Shot Question Generation
arXiv 2022
Coder Reviewer Reranking for Code Generation
arXiv 2022
The Web Is Your Oyster -- Knowledge-Intensive NLP against a Very Large Web Corpus
arXiv 2021
Dense Passage Retrieval for Open-Domain Question Answering
EMNLP 2020 11
TaBERT: Pretraining for Joint Understanding of Textual and Tabular Data
arXiv 2020
Answering Complex Open-Domain Questions with Multi-Hop Dense Retrieval
ICLR 2021 1
Frequent co-authors (10, from 29 papers)
Luke Zettlemoyer
professor
Mike Lewis
Xi Victoria Lin
Hannaneh Hajishirzi
professor
Pang Wei Koh
Sewon Min
Barlas Oğuz
Patrick Lewis
Sebastian Riedel
Weijia Shi
researcher