We introduce PRIMERA, a pre-trained model for multi-document representation with a focus on summarization that reduces the need for dataset-specific architectures and large amounts of labeled fine-tuning data. PRIMERA uses our newly proposed pre-training objective, designed to teach the model to connect and aggregate information across documents. It also uses efficient encoder-decoder transformers to simplify the processing of concatenated input documents. In extensive experiments on 6 multi-document summarization datasets from 3 different domains in zero-shot, few-shot, and fully supervised settings, PRIMERA outperforms current state-of-the-art dataset-specific and pre-trained models in most of these settings by large margins. The code and pre-trained models can be found at https://github.com/allenai/PRIMER.
PRIMERA: Pyramid-based Masked Sentence Pre-training for Multi-document Summarization
PRIMERA, a pre-trained multi-document summarization model, uses a novel pre-training objective and efficient encoder-decoder transformers to outperform dataset-specific and pre-trained baselines in most zero-shot, few-shot, and fully supervised settings across six datasets.
- Year: 2021
- Venue: ACL 2022 5
- Authors: 4
- Hosting: Abstract only (ARXIV-DEFAULT)
Attribution
- Abstract & full text: arxiv.org/abs/2110.08499v2 (ARXIV-DEFAULT)
- TL;DR: Semantic Scholar