Attention mechanisms have improved the performance of NLP tasks while allowing models to remain explainable. Self-attention is currently widely used; however, interpretability is difficult because of the numerous attention distributions. Recent work has shown that model representations can benefit from label-specific information, while facilitating interpretation of predictions. We introduce the Label Attention Layer: a new form of self-attention where attention heads represent labels. We test our novel layer by running constituency and dependency parsing experiments and show that our new model obtains new state-of-the-art results for both tasks on both the Penn Treebank (PTB) and the Chinese Treebank. Additionally, our model requires fewer self-attention layers than existing work. Finally, we find that the Label Attention heads learn relations between syntactic categories and show pathways to analyze errors.
Rethinking Self-Attention: Towards Interpretability in Neural Parsing
The Label Attention Layer, a form of self-attention in which each attention head represents a label, achieves state-of-the-art constituency and dependency parsing results on the Penn Treebank and the Chinese Treebank while using fewer self-attention layers and making predictions easier to interpret.
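The idea of "attention heads represent labels" can be illustrated with a toy sketch. This is not the authors' implementation. It assumes that each head owns a single query vector tied to one label, that token vectors serve directly as both keys and values (no learned projections), and that the vectors are random rather than trained. The label names `NP`, `VP`, and `S` and the function `label_attention` are hypothetical, chosen only for illustration.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def label_attention(tokens, label_queries):
    """Compute one attention distribution per label over the sentence.

    tokens: list of n token vectors (lists of d floats), used here as both
        keys and values (assumption: no learned key/value projections).
    label_queries: dict mapping a label name to a query vector of size d
        (assumption: one query vector per label head).
    Returns (weights, contexts), both dicts keyed by label name.
    """
    d = len(tokens[0])
    weights, contexts = {}, {}
    for label, query in label_queries.items():
        # Scaled dot product between the label's query and every token.
        scores = [
            sum(q * k for q, k in zip(query, tok)) / math.sqrt(d)
            for tok in tokens
        ]
        attn = softmax(scores)
        weights[label] = attn
        # Label-specific context vector: attention-weighted sum of tokens.
        contexts[label] = [
            sum(a * tok[j] for a, tok in zip(attn, tokens)) for j in range(d)
        ]
    return weights, contexts


# Toy inputs: random, untrained vectors (illustration only).
rng = random.Random(0)
d_model = 4
sentence = ["The", "cat", "sleeps"]
tokens = [[rng.gauss(0, 1) for _ in range(d_model)] for _ in sentence]
label_queries = {
    label: [rng.gauss(0, 1) for _ in range(d_model)]
    for label in ("NP", "VP", "S")
}

weights, contexts = label_attention(tokens, label_queries)
for label, attn in weights.items():
    print(label, [round(a, 3) for a in attn])
```

Because each distribution is keyed by a label, a reader can ask which words a given label attends to, which is the kind of inspection the abstract describes as facilitating interpretation.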
- Year: 2019
- Venue: Findings of the Association for Computational Linguistics 2020
- Authors: 6
- Hosting: Abstract only (ARXIV-DEFAULT)
Attribution
- Abstract & full text: arxiv.org/abs/1911.03875v3 (ARXIV-DEFAULT)
- TL;DR: Semantic Scholar