We present a methodology for using dynamic evaluation to improve neural sequence models. Models are adapted to recent history via a gradient-descent-based mechanism, causing them to assign higher probabilities to re-occurring sequential patterns. Dynamic evaluation outperforms existing adaptation approaches in our comparisons. Dynamic evaluation improves the state-of-the-art word-level perplexities on the Penn Treebank and WikiText-2 datasets to 51.1 and 44.3 respectively, and the state-of-the-art character-level cross-entropies on the text8 and Hutter Prize datasets to 1.19 bits/char and 1.08 bits/char respectively.
Dynamic Evaluation of Neural Sequence Models
Dynamic evaluation enhances neural sequence models by adapting them to recent history at test time, improving word-level perplexity on Penn Treebank and WikiText-2 and character-level cross-entropy on text8 and the Hutter Prize dataset.
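The adaptation mechanism described in the abstract can be illustrated with a toy sketch. The code below is not the paper's model or update rule: it assumes a hypothetical bigram logit table in place of a neural network and uses plain SGD with an arbitrary learning rate. It only shows the core loop of dynamic evaluation: score each token with the current parameters, then take a gradient step on that token's loss so that recurring patterns become more probable later in the sequence.

```python
import math

def softmax(logits):
    # Numerically stable softmax over a list of floats.
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def evaluate(tokens, vocab_size, lr=0.0):
    """Average negative log-likelihood (nats/token) of a toy bigram model.

    lr == 0.0 is static evaluation (parameters stay fixed).
    lr > 0.0 is a simplified dynamic evaluation using plain SGD
    (an assumption of this sketch, not the paper's exact update rule).
    """
    # logits[prev][next]; all zeros means a uniform prediction.
    logits = [[0.0] * vocab_size for _ in range(vocab_size)]
    total_nll = 0.0
    for prev, nxt in zip(tokens, tokens[1:]):
        probs = softmax(logits[prev])
        # Score the token first, using parameters that have not seen it yet.
        total_nll += -math.log(probs[nxt])
        if lr > 0.0:
            # Gradient of -log p[nxt] w.r.t. the logits is (probs - one_hot).
            for k in range(vocab_size):
                grad = probs[k] - (1.0 if k == nxt else 0.0)
                logits[prev][k] -= lr * grad
    return total_nll / (len(tokens) - 1)

# A sequence with a strongly re-occurring pattern.
tokens = [0, 1, 2, 3] * 25
static_nll = evaluate(tokens, vocab_size=4)
dynamic_nll = evaluate(tokens, vocab_size=4, lr=0.5)
print(f"static:  {static_nll:.3f} nats/token")
print(f"dynamic: {dynamic_nll:.3f} nats/token")
```

Because each token is scored before the parameters are updated on it, the reported loss remains a valid test-time measurement. On this repetitive toy sequence the adapted model reaches a lower average loss than the fixed one.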
- Year: 2017
- Venue: dynamic-evaluation-of-neural-sequence-models-1
- Authors: 4
- Hosting: Abstract only (ARXIV-DEFAULT)
Attribution
- Abstract & full text: arxiv.org/abs/1709.07432v2 (ARXIV-DEFAULT)
- TL;DR: Semantic Scholar