0

Do ImageNet Classifiers Generalize to ImageNet?

Evaluation on newly created test sets for CIFAR-10 and ImageNet reveals that current models generalize poorly to slightly more challenging images, leading to substantial accuracy drops.

Year
2019
Venue
NeurIPS Workshop ImageNet_PPF 2021 12
Authors
4
Hosting
Abstract onlyARXIV-DEFAULT

Cite

Notes

Only stored in your browser.

Attribution

Abstract & full text
arxiv.org/abs/1902.10811v2ARXIV-DEFAULT
TL;DR
Semantic Scholar
Attribution policy →

Abstract

We build new test sets for the CIFAR-10 and ImageNet datasets. Both benchmarks have been the focus of intense research for almost a decade, raising the danger of overfitting to excessively re-used test sets. By closely following the original dataset creation processes, we test to what extent current classification models generalize to new data. We evaluate a broad range of models and find accuracy drops of 3% - 15% on CIFAR-10 and 11% - 14% on ImageNet. However, accuracy gains on the original test sets translate to larger gains on the new test sets. Our results suggest that the accuracy drops are not caused by adaptivity, but by the models' inability to generalize to slightly "harder" images than those found in the original test sets.

Authors

4