Pre-trained Summarization Distillation
Recent state-of-the-art approaches to summarization use large pre-trained Transformer models. Distilling these models into smaller student models has become critically important for practical use; however, there are many different distillation methods proposed by the NLP…
- Year: 2020
- Venue: arXiv 2020
- Authors: 2
- Hosting: External source, license unknown