Pre-trained Summarization Distillation

Recent state-of-the-art approaches to summarization utilize large pre-trained Transformer models. Distilling these models to smaller student models has become critically important for practical use; however, there are many different distillation methods proposed by the NLP…

Year
2020
Venue
arXiv 2020
Authors
2
Hosting
External source (license unknown)

Attribution

Abstract & full text
arxiv.org/abs/2010.13002v2
TL;DR
Semantic Scholar