Improve Transformer Models with Better Relative Position Embeddings

Transformer architectures rely on explicit position encodings in order to preserve a notion of word order. In this paper, we argue that existing work does not fully utilize position information.

Year: 2020
Venue: Findings of the Association for Computational Linguistics 2020
ArXiv: arxiv.org/abs/2009.13658
Authors: 4
Hosting: External source, license unknown

Attribution

Abstract & full text: arxiv.org/abs/2009.13658v1
TL;DR: Semantic Scholar

Authors

Zhiheng Huang, Davis Liang, Peng Xu, Bing Xiang