Efficient Attentions for Long Document Summarization

Hepos, a novel encoder-decoder attention mechanism, enables efficient processing of long documents for summarization, yielding higher ROUGE scores and more informative summaries than existing models.

Year
2021
Venue
NAACL 2021
Authors
5
Abstract & full text
arxiv.org/abs/2104.02112v2

Abstract

The quadratic computational and memory complexities of large Transformers have limited their scalability for long document summarization. In this paper, we propose Hepos, a novel efficient encoder-decoder attention with head-wise positional strides to effectively pinpoint salient information from the source. We further conduct a systematic study of existing efficient self-attentions. Combined with Hepos, we are able to process ten times more tokens than existing models that use full attentions. For evaluation, we present a new dataset, GovReport, with significantly longer documents and summaries. Results show that our models produce significantly higher ROUGE scores than competitive comparisons, including new state-of-the-art results on PubMed. Human evaluation also shows that our models generate more informative summaries with fewer unfaithful errors.
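To make "head-wise positional strides" concrete, here is a minimal sketch in plain Python. The abstract does not spell out the masking rule, so the sketch assumes that encoder-decoder head h attends only to source positions i with (i - h) mod s = 0 for a stride s. The function names `hepos_mask` and `strided_attention_weights` are hypothetical and are not taken from the authors' code.

```python
import math


def hepos_mask(num_heads, src_len, stride):
    """Build mask[h][i]: True when head h may attend to source position i.

    Assumed rule (not stated in the abstract): head h starts at offset h
    and then takes every `stride`-th source position.
    """
    return [[(i - h) % stride == 0 for i in range(src_len)]
            for h in range(num_heads)]


def strided_attention_weights(scores, allowed):
    """Softmax over the allowed source positions; all others get weight 0."""
    kept = [s for s, ok in zip(scores, allowed) if ok]
    if not kept:
        raise ValueError("head has no source position to attend to")
    peak = max(kept)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) if ok else 0.0
            for s, ok in zip(scores, allowed)]
    total = sum(exps)
    return [e / total for e in exps]


# Four heads over eight source tokens with stride 2.
mask = hepos_mask(num_heads=4, src_len=8, stride=2)
weights = strided_attention_weights([0.0] * 8, mask[0])
```

Under this assumption each head scores roughly `src_len / stride` source positions instead of all of them, while the heads together still cover every position as long as there are at least `stride` heads.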