Low-resource speech recognition has long suffered from insufficient training data. In this paper, we propose an approach that leverages neighboring languages to improve performance in low-resource scenarios. It is founded on the hypothesis that similar linguistic units in neighboring languages exhibit comparable term frequency distributions, which enables us to construct a Huffman tree for multilingual hierarchical Softmax decoding. This hierarchical structure enables cross-lingual knowledge sharing among similar tokens, thereby enhancing low-resource training outcomes. Empirical analyses demonstrate that our method improves both the accuracy and the efficiency of low-resource speech recognition.
Hierarchical Softmax for End-to-End Low-resource Multilingual Speech Recognition
Using Huffman trees for multilingual hierarchical Softmax decoding improves low-resource speech recognition by sharing knowledge across similar linguistic units.
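As a rough illustration of the idea summarized above, the following Python sketch builds a Huffman tree from token frequencies and scores a token as a product of binary decisions along its root-to-leaf path. It is not the authors' implementation. The function names (`build_huffman_codes`, `token_probability`), the toy frequency table, and the per-node scores are all assumptions made for illustration. In a real recognizer the node scores would come from the acoustic model, and the frequencies would be counted over the multilingual training text.

```python
import heapq
import itertools
import math


def build_huffman_codes(freqs):
    """Map each token to its root-to-leaf path [(internal_node_id, bit), ...].

    `freqs` maps token strings to counts (hypothetical toy data here).
    """
    tie = itertools.count()  # tie-breaker so the heap never compares nodes
    # Heap entries are (frequency, tie_breaker, node). A leaf is a token
    # string; an internal node is a tuple (node_id, left_child, right_child).
    heap = [(f, next(tie), tok) for tok, f in freqs.items()]
    heapq.heapify(heap)
    node_ids = itertools.count()
    while len(heap) > 1:
        # Merge the two least frequent subtrees under a new internal node.
        f1, _, left = heapq.heappop(heap)
        f2, _, right = heapq.heappop(heap)
        node = (next(node_ids), left, right)
        heapq.heappush(heap, (f1 + f2, next(tie), node))
    root = heap[0][2]

    codes = {}

    def walk(node, path):
        if isinstance(node, str):  # reached a leaf (a token)
            codes[node] = path
            return
        node_id, left, right = node
        walk(left, path + [(node_id, 0)])
        walk(right, path + [(node_id, 1)])

    walk(root, [])
    return codes


def token_probability(path, node_scores):
    """Hierarchical Softmax: multiply the binary decisions along the path.

    `node_scores[node_id]` is an assumed real-valued score for each internal
    node; sigmoid(score) is taken as the probability of branching right.
    """
    prob = 1.0
    for node_id, bit in path:
        p_right = 1.0 / (1.0 + math.exp(-node_scores[node_id]))
        prob *= p_right if bit == 1 else 1.0 - p_right
    return prob


# Toy token frequencies (assumed, not taken from the paper).
toy_freqs = {"a": 50, "b": 20, "c": 15, "d": 10, "e": 5}
toy_codes = build_huffman_codes(toy_freqs)
```

Frequent tokens end up near the root, so scoring them takes fewer binary decisions than a flat Softmax over the whole vocabulary would take terms. Tokens that share a prefix of the path also share the parameters of the internal nodes on that prefix.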
- Year
- 2022
- Venue
- arXiv 2022
- Authors
- 11
- Hosting
- Abstract only (ARXIV-DEFAULT)
Attribution
- Abstract & full text
- arxiv.org/abs/2204.03855v2 (ARXIV-DEFAULT)
- TL;DR
- Semantic Scholar