mLongT5: A Multilingual and Efficient Text-To-Text Transformer for Longer Sequences

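mLongT5 is a multilingual text-to-text transformer for long inputs that combines the LongT5 architecture with the multilingual pretraining datasets of mT5 and the pretraining tasks of UL2. It outperforms existing multilingual models such as mBART and M-BERT on summarization and question-answering tasks.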

Year: 2023
Venue: arXiv 2023
ArXiv: arxiv.org/abs/2305.11129
Authors: 4

Attribution

Abstract & full text: arxiv.org/abs/2305.11129v2
TL;DR: Semantic Scholar

Abstract

We present our work on developing a multilingual, efficient text-to-text transformer that is suitable for handling long inputs. This model, called mLongT5, builds upon the architecture of LongT5, while leveraging the multilingual datasets used for pretraining mT5 and the pretraining tasks of UL2. We evaluate this model on a variety of multilingual summarization and question-answering tasks, and the results show stronger performance for mLongT5 when compared to existing multilingual models such as mBART or M-BERT.

Authors

Joshua Ainslie, Santiago Ontanon, Mandy Guo, David Uthus