Dynamic Evaluation of Neural Sequence Models

Dynamic evaluation adapts neural sequence models to recent history via gradient descent, improving word-level perplexity on Penn Treebank and WikiText-2 and character-level cross-entropy on text8 and the Hutter Prize dataset.

Year
2017
Venue
dynamic-evaluation-of-neural-sequence-models-1
Authors
4
Hosting
Abstract only

Attribution

Abstract & full text
arxiv.org/abs/1709.07432v2
TL;DR
Semantic Scholar

Abstract

We present methodology for using dynamic evaluation to improve neural sequence models. Models are adapted to recent history via a gradient descent based mechanism, causing them to assign higher probabilities to re-occurring sequential patterns. Dynamic evaluation outperforms existing adaptation approaches in our comparisons. Dynamic evaluation improves the state-of-the-art word-level perplexities on the Penn Treebank and WikiText-2 datasets to 51.1 and 44.3 respectively, and the state-of-the-art character-level cross-entropies on the text8 and Hutter Prize datasets to 1.19 bits/char and 1.08 bits/char respectively.
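
The adaptation loop the abstract describes (score a stretch of text, then take a gradient step on it before scoring the next stretch) can be sketched on a toy model. The following is a minimal illustration only, not the authors' implementation: the bigram softmax model, the all-zero "pretrained" weights, the plain SGD update, and the names `evaluate`, `segment_len`, `lr`, `perplexity`, and `bits_per_token` are all assumptions made for this example. The paper describes its own update rule, which this sketch does not reproduce.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def evaluate(tokens, vocab_size, lr=0.0, segment_len=4):
    """Average cross-entropy in nats per token.

    lr == 0.0 gives static evaluation (parameters never change);
    lr > 0.0 gives a dynamic-evaluation-style loop with plain SGD.
    """
    # Hypothetical "pretrained" bigram model: zero logits, i.e. uniform predictions.
    W = [[0.0] * vocab_size for _ in range(vocab_size)]
    total_loss, count = 0.0, 0

    for start in range(1, len(tokens), segment_len):
        end = min(start + segment_len, len(tokens))
        grads = {}  # (prev, k) -> gradient of the segment loss w.r.t. W[prev][k]

        # 1) Score the segment with the current parameters, before adapting to it.
        for t in range(start, end):
            prev, target = tokens[t - 1], tokens[t]
            probs = softmax(W[prev])
            total_loss += -math.log(probs[target])
            count += 1
            # Softmax cross-entropy gradient: probs minus one-hot(target).
            for k in range(vocab_size):
                g = probs[k] - (1.0 if k == target else 0.0)
                grads[(prev, k)] = grads.get((prev, k), 0.0) + g

        # 2) Adapt: one gradient step on the segment that was just scored.
        for (prev, k), g in grads.items():
            W[prev][k] -= lr * g

    return total_loss / count


def perplexity(nats):
    """Perplexity from average cross-entropy in nats."""
    return math.exp(nats)


def bits_per_token(nats):
    """Convert average cross-entropy from nats to bits."""
    return nats / math.log(2)


# A sequence with a re-occurring pattern, as in the abstract's motivation.
tokens = [0, 1, 2, 3] * 10
static_nats = evaluate(tokens, vocab_size=4, lr=0.0)
dynamic_nats = evaluate(tokens, vocab_size=4, lr=0.5)
print("static :", perplexity(static_nats), bits_per_token(static_nats))
print("dynamic:", perplexity(dynamic_nats), bits_per_token(dynamic_nats))
```

In this toy setting the dynamic run scores lower cross-entropy than the static run because each segment is scored before the model adapts to it, and the repeated pattern is then assigned higher probability in later segments. The two helper functions show how the metrics quoted in the abstract relate to the same quantity: perplexity is the exponential of the average cross-entropy in nats, and bits per character is that cross-entropy divided by ln 2.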
