Large deep neural networks are powerful, but exhibit undesirable behaviors such as memorization and sensitivity to adversarial examples. In this work, we propose mixup, a simple learning principle to alleviate these issues. In essence, mixup trains a neural network on convex combinations of pairs of examples and their labels. By doing so, mixup regularizes the neural network to favor simple linear behavior in-between training examples. Our experiments on the ImageNet-2012, CIFAR-10, CIFAR-100, Google commands and UCI datasets show that mixup improves the generalization of state-of-the-art neural network architectures. We also find that mixup reduces the memorization of corrupt labels, increases the robustness to adversarial examples, and stabilizes the training of generative adversarial networks.
mixup: Beyond Empirical Risk Minimization
Mixup, a learning principle that uses convex combinations of training examples, enhances generalization, reduces memorization of corrupt labels, and improves robustness against adversarial examples.
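The convex combination of examples and labels described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names `mixup_pair` and `mixup_batch`, the default `alpha=0.2`, and the choice to draw the mixing weight `lam` from a Beta(alpha, alpha) distribution are assumptions that this page does not state. Inputs are plain Python lists, and labels are assumed to be one-hot vectors.

```python
import random


def mixup_pair(x_i, y_i, x_j, y_j, lam):
    """Return the convex combination of two examples and of their labels.

    x_* are feature vectors and y_* are one-hot label vectors (lists of floats).
    lam in [0, 1] is the weight given to the first example.
    """
    x_mix = [lam * a + (1.0 - lam) * b for a, b in zip(x_i, x_j)]
    y_mix = [lam * a + (1.0 - lam) * b for a, b in zip(y_i, y_j)]
    return x_mix, y_mix


def mixup_batch(xs, ys, alpha=0.2, rng=None):
    """Mix every example in a batch with a randomly chosen partner.

    Assumption: a single weight lam ~ Beta(alpha, alpha) is shared by the
    whole batch, and partners come from a random shuffle of the batch.
    """
    rng = rng or random.Random()
    lam = rng.betavariate(alpha, alpha)

    # partners[i] is the index of the example mixed with example i.
    partners = list(range(len(xs)))
    rng.shuffle(partners)

    mixed = [
        mixup_pair(xs[i], ys[i], xs[j], ys[j], lam)
        for i, j in enumerate(partners)
    ]
    x_mix = [pair[0] for pair in mixed]
    y_mix = [pair[1] for pair in mixed]
    return x_mix, y_mix, lam
```

A network trained on `(x_mix, y_mix)` instead of the raw batch sees soft labels that vary linearly between the two source labels, which is the "simple linear behavior in-between training examples" the abstract refers to. Because each mixed label is a convex combination of one-hot vectors, its entries still sum to 1.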
- Year
- 2017
- Authors
- 4
- Hosting
- Abstract only (arXiv)
Attribution
- Abstract & full text
- arxiv.org/abs/1710.09412v2
- TL;DR
- Semantic Scholar