Jimmy Lin
- Papers: 43
Authored papers
Beyond Semantic Similarity: Rethinking Retrieval for Agentic Search via Direct Corpus Interaction
arXiv 2026
NanoKnow: How to Know What Your Language Model Knows
arXiv 2026
Rethinking Agentic Search with Pi-Serini: Is Lexical Retrieval Sufficient?
arXiv 2026
Tevatron 2.0: Unified Document Retrieval Toolkit across Scale, Language, and Modality
arXiv 2025
Rank-R1: Enhancing Reasoning in LLM-based Document Rerankers via Reinforcement Learning
arXiv 2025
DRAMA: Diverse Augmentation from Large Language Models to Smaller Dense Retrievers
arXiv 2025
Fixing Data That Hurts Performance: Cascading LLMs to Relabel Hard Negatives for Robust Information Retrieval
arXiv 2025
Document Screenshot Retrievers are Vulnerable to Pixel Poisoning Attacks
arXiv 2025
Conventional Contrastive Learning Often Falls Short: Improving Dense Retrieval with Cross-Encoder Listwise Distillation and Synthetic Data
arXiv 2025
Teaching Dense Retrieval Models to Specialize with Listwise Distillation and LLM Data Augmentation
arXiv 2025
BrowseComp-Plus: A More Fair and Transparent Evaluation Benchmark of Deep-Research Agent
arXiv 2025
Chatbot Arena Meets Nuggets: Towards Explanations and Diagnostics in the Evaluation of LLM Responses
arXiv 2025
CURE: A dataset for Clinical Understanding & Retrieval Evaluation
arXiv 2024
Ragnarök: A Reusable RAG Framework and Baselines for TREC 2024 Retrieval-Augmented Generation Track
arXiv 2024
Nearest Neighbor Speculative Decoding for LLM Generation and Attribution
arXiv 2024
UniRAG: Universal Retrieval Augmentation for Large Vision Language Models
arXiv 2024
MIRAGE-Bench: Automatic Multilingual Benchmark Arena for Retrieval-Augmented Generation Systems
arXiv 2024
PromptReps: Prompting Large Language Models to Generate Dense and Sparse Representations for Zero-Shot Document Retrieval
arXiv 2024
Resources for Brewing BEIR: Reproducible Reference Models and an Official Leaderboard
arXiv 2023
SLIM: Sparsified Late Interaction for Multi-Vector Retrieval with Inverted Indexes
arXiv 2023
RankZephyr: Effective and Robust Zero-Shot Listwise Reranking is a Breeze!
arXiv 2023
Leveraging LLMs for Synthesizing Training Data Across Many Languages in Multilingual Dense Retrieval
arXiv 2023
"Knowing When You Don't Know": A Multilingual Relevance Assessment Dataset for Robust Retrieval-Augmented Generation
arXiv 2023
GAIA Search: Hugging Face and Pyserini Interoperability for NLP Training Data Exploration
arXiv 2023
Found in the Middle: Permutation Self-Consistency Improves Listwise Ranking in Large Language Models
arXiv 2023
What Do Llamas Really Think? Revealing Preference Biases in Language Model Representations
arXiv 2023
HAGRID: A Human-LLM Collaborative Dataset for Generative Information-Seeking with Attribution
arXiv 2023
Spacerini: Plug-and-play Search Engines with Pyserini and Hugging Face
arXiv 2023
What the DAAM: Interpreting Stable Diffusion Using Cross Attention
arXiv 2022
Precise Zero-Shot Dense Retrieval without Relevance Labels
arXiv 2022
Aggretriever: A Simple Approach to Aggregate Textual Representations for Robust Dense Passage Retrieval
arXiv 2022
Making a MIRACL: Multilingual Information Retrieval Across a Continuum of Languages
arXiv 2022
Efficiently Teaching an Effective Dense Retriever with Balanced Topic Aware Sampling
arXiv 2021
Improving Query Representations for Dense Retrieval with Pseudo Relevance Feedback: A Reproducibility Study
arXiv 2021
Mr. TyDi: A Multi-lingual Benchmark for Dense Retrieval
EMNLP (MRL) 2021 11
DeeBERT: Dynamic Early Exiting for Accelerating BERT Inference
Howl: A Deployed, Open-Source Wake Word Detection System
EMNLP (NLPOSS) 2020 11
The Archives Unleashed Project: Technology, Process, and Community to Improve Scholarly Access to Web Archives
arXiv 2020
Showing Your Work Doesn't Always Work
Inserting Information Bottlenecks for Attribution in Transformers
Findings of the Association for Computational Linguistics 2020
DocBERT: BERT for Document Classification
arXiv 2019
End-to-End Open-Domain Question Answering with BERTserini
Deep Residual Learning for Small-Footprint Keyword Spotting
arXiv 2017
Frequent co-authors
10 from 43 papers
Xueguang Ma (grad student)
Nandan Thakur
Raphael Tang
Xinyu Zhang
Shengyao Zhuang
Sahel Sharifymoghaddam
Ehsan Kamalloo
Guido Zuccon
Odunayo Ogundepo
Akintunde Oladipo