RepBERT: Contextualized Text Embeddings for First-Stage Retrieval

RepBERT represents documents and queries with fixed-length contextualized embeddings and scores relevance by their inner product, achieving state-of-the-art results among first-stage retrieval techniques on the MS MARCO Passage Ranking task with efficiency comparable to bag-of-words methods.

Year: 2020
Venue: arXiv 2020
ArXiv: arxiv.org/abs/2006.15498
Authors: 5
Hosting: Abstract only (ARXIV-DEFAULT)

Attribution

Abstract & full text: arxiv.org/abs/2006.15498v2 (ARXIV-DEFAULT)
TL;DR: Semantic Scholar


Abstract

Although exact term match between queries and documents is the dominant method to perform first-stage retrieval, we propose a different approach, called RepBERT, to represent documents and queries with fixed-length contextualized embeddings. The inner products of query and document embeddings are regarded as relevance scores. On the MS MARCO Passage Ranking task, RepBERT achieves state-of-the-art results among all initial retrieval techniques, and its efficiency is comparable to bag-of-words methods.
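The scoring scheme described in the abstract can be sketched in a few lines of Python. This is an illustrative toy, not the authors' implementation: the abstract does not say how token embeddings are reduced to a fixed-length vector, so the mean pooling here is an assumption, and the names `mean_pool`, `inner_product`, and `retrieve` are hypothetical. The hand-written 2-dimensional vectors stand in for the outputs of a contextual encoder such as BERT.

```python
from typing import Dict, List, Sequence, Tuple

Vector = List[float]


def mean_pool(token_vectors: Sequence[Sequence[float]]) -> Vector:
    """Collapse per-token vectors into one fixed-length vector.

    Mean pooling is an assumption made for this sketch; the abstract only
    states that the final embedding has a fixed length.
    """
    if not token_vectors:
        raise ValueError("need at least one token vector")
    dim = len(token_vectors[0])
    count = len(token_vectors)
    return [sum(tok[i] for tok in token_vectors) / count for i in range(dim)]


def inner_product(a: Sequence[float], b: Sequence[float]) -> float:
    """Relevance score: inner product of a query and a document embedding."""
    if len(a) != len(b):
        raise ValueError("embeddings must have the same length")
    return sum(x * y for x, y in zip(a, b))


def retrieve(query_vec: Vector, doc_index: Dict[str, Vector], k: int) -> List[Tuple[str, float]]:
    """Score every indexed document against the query and return the top k."""
    scored = [(doc_id, inner_product(query_vec, vec)) for doc_id, vec in doc_index.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Toy token vectors standing in for encoder outputs (hypothetical values).
doc_token_vectors = {
    "d1": [[1.0, 0.0], [0.0, 1.0]],
    "d2": [[2.0, 0.0], [2.0, 2.0]],
    "d3": [[-1.0, 0.0]],
}

# Document embeddings are computed once, offline, and stored in an index.
doc_index = {doc_id: mean_pool(vecs) for doc_id, vecs in doc_token_vectors.items()}

# At query time only the query is encoded; scoring is a plain inner product.
query_vec = mean_pool([[1.0, 1.0], [1.0, 0.0]])
top = retrieve(query_vec, doc_index, k=2)
print(top)
```

Because each document embedding depends only on the document, the index can be built ahead of time, which is what makes this kind of retrieval a candidate for the first stage. A real system would replace the exhaustive loop in `retrieve` with a vectorized or approximate maximum-inner-product search.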

Authors

Min Zhang, Yiqun Liu, Jiaxin Mao, Jingtao Zhan, Shaoping Ma