VT-ADL: A Vision Transformer Network for Image Anomaly Detection and Localization

A transformer-based image anomaly detection and localization network uses patch embedding and a Gaussian mixture density network to detect and localize anomalies, outperforming other methods on datasets such as MNIST and MVTec.

Year
2021
Venue
arXiv 2021
Authors
5
Hosting
Abstract only (ARXIV-DEFAULT)

Attribution

Abstract & full text
arxiv.org/abs/2104.10036 (ARXIV-DEFAULT)
TL;DR
Semantic Scholar

Abstract

We present a transformer-based image anomaly detection and localization network. Our proposed model combines a reconstruction-based approach with patch embedding. The use of transformer networks helps to preserve the spatial information of the embedded patches, which are later processed by a Gaussian mixture density network to localize the anomalous areas. In addition, we publish BTAD, a real-world industrial anomaly dataset. Our results are compared with other state-of-the-art algorithms using publicly available datasets such as MNIST and MVTec.
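
To make the localization idea concrete, here is a minimal sketch of per-patch scoring with a Gaussian mixture. It is not the paper's implementation. The abstract does not specify the architecture, so several things here are assumptions made for illustration: raw pixel values stand in for the transformer's patch embeddings, the mixture parameters (`weights`, `means`, `sigmas`) are fixed by hand rather than predicted by a network, the components are isotropic, and the function names are hypothetical.

```python
import math


def split_into_patches(image, patch):
    """Split a 2D list (H x W) into non-overlapping patch x patch blocks.

    Patches are returned in row-major order, each flattened to a 1D list,
    so a patch's index still tells us where it sits in the image.
    """
    h, w = len(image), len(image[0])
    if h % patch or w % patch:
        raise ValueError("image size must be divisible by patch size")
    patches = []
    for top in range(0, h, patch):
        for left in range(0, w, patch):
            patches.append([image[r][c]
                            for r in range(top, top + patch)
                            for c in range(left, left + patch)])
    return patches


def gmm_nll(x, weights, means, sigmas):
    """Negative log-likelihood of x under a mixture of isotropic Gaussians.

    Assumption: isotropic components with one standard deviation each.
    Uses log-sum-exp for numerical stability.
    """
    d = len(x)
    log_terms = []
    for w, mu, s in zip(weights, means, sigmas):
        sq_dist = sum((xi - mi) ** 2 for xi, mi in zip(x, mu))
        log_norm = -0.5 * d * math.log(2 * math.pi * s * s)
        log_terms.append(math.log(w) + log_norm - sq_dist / (2 * s * s))
    m = max(log_terms)
    return -(m + math.log(sum(math.exp(t - m) for t in log_terms)))


def patch_anomaly_map(image, patch, weights, means, sigmas):
    """Score every patch and arrange the scores on the patch grid.

    A higher score means the patch is less likely under the mixture,
    i.e. more anomalous under this toy model.
    """
    cols = len(image[0]) // patch
    scores = [gmm_nll(p, weights, means, sigmas)
              for p in split_into_patches(image, patch)]
    return [scores[i:i + cols] for i in range(0, len(scores), cols)]
```

Because each score keeps its grid position, the resulting map can be upsampled or thresholded to mark anomalous regions. In a learned model, the mixture parameters would come from a network conditioned on the patch embeddings instead of being supplied by hand.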
