Jimmy Lin

Rank-R1: Enhancing Reasoning in LLM-based Document Rerankers via Reinforcement Learning

arXiv 2025

BrowseComp-Plus: A More Fair and Transparent Evaluation Benchmark of Deep-Research Agent

arXiv 2025

Fixing Data That Hurts Performance: Cascading LLMs to Relabel Hard Negatives for Robust Information Retrieval

arXiv 2025

Document Screenshot Retrievers are Vulnerable to Pixel Poisoning Attacks

arXiv 2025

Conventional Contrastive Learning Often Falls Short: Improving Dense Retrieval with Cross-Encoder Listwise Distillation and Synthetic Data

arXiv 2025

DRAMA: Diverse Augmentation from Large Language Models to Smaller Dense Retrievers

arXiv 2025

Teaching Dense Retrieval Models to Specialize with Listwise Distillation and LLM Data Augmentation

arXiv 2025

Chatbot Arena Meets Nuggets: Towards Explanations and Diagnostics in the Evaluation of LLM Responses

arXiv 2025

CURE: A dataset for Clinical Understanding & Retrieval Evaluation

arXiv 2024

Nearest Neighbor Speculative Decoding for LLM Generation and Attribution

arXiv 2024

UniRAG: Universal Retrieval Augmentation for Large Vision Language Models

arXiv 2024

MIRAGE-Bench: Automatic Multilingual Benchmark Arena for Retrieval-Augmented Generation Systems

arXiv 2024

Ragnarök: A Reusable RAG Framework and Baselines for TREC 2024 Retrieval-Augmented Generation Track

arXiv 2024

PromptReps: Prompting Large Language Models to Generate Dense and Sparse Representations for Zero-Shot Document Retrieval

arXiv 2024

Resources for Brewing BEIR: Reproducible Reference Models and an Official Leaderboard

arXiv 2023

SLIM: Sparsified Late Interaction for Multi-Vector Retrieval with Inverted Indexes

arXiv 2023

RankZephyr: Effective and Robust Zero-Shot Listwise Reranking is a Breeze!

arXiv 2023

Spacerini: Plug-and-play Search Engines with Pyserini and Hugging Face

arXiv 2023

"Knowing When You Don't Know": A Multilingual Relevance Assessment Dataset for Robust Retrieval-Augmented Generation

arXiv 2023

GAIA Search: Hugging Face and Pyserini Interoperability for NLP Training Data Exploration

arXiv 2023

Found in the Middle: Permutation Self-Consistency Improves Listwise Ranking in Large Language Models

arXiv 2023

What Do Llamas Really Think? Revealing Preference Biases in Language Model Representations

arXiv 2023

Leveraging LLMs for Synthesizing Training Data Across Many Languages in Multilingual Dense Retrieval

arXiv 2023

HAGRID: A Human-LLM Collaborative Dataset for Generative Information-Seeking with Attribution

arXiv 2023

What the DAAM: Interpreting Stable Diffusion Using Cross Attention

arXiv 2022

Precise Zero-Shot Dense Retrieval without Relevance Labels

arXiv 2022

Making a MIRACL: Multilingual Information Retrieval Across a Continuum of Languages

arXiv 2022

Aggretriever: A Simple Approach to Aggregate Textual Representations for Robust Dense Passage Retrieval

arXiv 2022

Improving Query Representations for Dense Retrieval with Pseudo Relevance Feedback: A Reproducibility Study

arXiv 2021

Mr. TyDi: A Multi-lingual Benchmark for Dense Retrieval

EMNLP (MRL) 2021

Efficiently Teaching an Effective Dense Retriever with Balanced Topic Aware Sampling

arXiv 2021

Showing Your Work Doesn't Always Work

Inserting Information Bottlenecks for Attribution in Transformers

Findings of the Association for Computational Linguistics 2020

Howl: A Deployed, Open-Source Wake Word Detection System

EMNLP (NLPOSS) 2020

DeeBERT: Dynamic Early Exiting for Accelerating BERT Inference

The Archives Unleashed Project: Technology, Process, and Community to Improve Scholarly Access to Web Archives

arXiv 2020