We present a transformer-based image anomaly detection and localization network. Our proposed model is a combination of a reconstruction-based approach and patch embedding. The use of transformer networks helps to preserve the spatial information of the embedded patches, which are later processed by a Gaussian mixture density network to localize the anomalous areas. We also publish BTAD, a real-world industrial anomaly dataset. Our results are compared with other state-of-the-art algorithms using publicly available datasets such as MNIST and MVTec.
VT-ADL: A Vision Transformer Network for Image Anomaly Detection and Localization
A transformer-based image anomaly detection and localization network uses patch embedding and a Gaussian mixture density network to identify and locate anomalies, and is compared with other state-of-the-art methods on datasets such as MNIST and MVTec.
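The patch-level localization idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it skips the transformer encoder and the learned density network entirely, scores raw pixel patches instead of embedded ones, and uses a hand-set isotropic Gaussian mixture. The function names (`split_into_patches`, `gmm_nll`, `anomaly_map`) and all parameter values are hypothetical and chosen only for illustration.

```python
import numpy as np

def split_into_patches(image, patch):
    """Split an (H, W) image into non-overlapping patch x patch tiles.

    Returns an array of shape (num_patches, patch * patch), in row-major
    order over the patch grid, so each patch keeps its grid position.
    """
    h, w = image.shape
    assert h % patch == 0 and w % patch == 0, "image must tile evenly"
    rows, cols = h // patch, w // patch
    tiles = image.reshape(rows, patch, cols, patch).swapaxes(1, 2)
    return tiles.reshape(rows * cols, patch * patch)

def gmm_nll(x, weights, means, sigmas):
    """Negative log-likelihood of each row of x under an isotropic
    Gaussian mixture (assumed form; parameters are fixed, not learned).

    x: (N, D), weights: (K,), means: (K, D), sigmas: (K,)
    """
    d = x.shape[1]
    # Squared distance from every sample to every component mean: (N, K)
    sq = ((x[:, None, :] - means[None, :, :]) ** 2).sum(axis=2)
    log_comp = (
        np.log(weights)
        - 0.5 * d * np.log(2.0 * np.pi * sigmas ** 2)
        - sq / (2.0 * sigmas ** 2)
    )
    # Numerically stable log-sum-exp over the K components
    m = log_comp.max(axis=1, keepdims=True)
    log_mix = m[:, 0] + np.log(np.exp(log_comp - m).sum(axis=1))
    return -log_mix

def anomaly_map(image, patch, weights, means, sigmas):
    """Per-patch anomaly scores arranged on the patch grid."""
    h, w = image.shape
    scores = gmm_nll(split_into_patches(image, patch), weights, means, sigmas)
    return scores.reshape(h // patch, w // patch)

# Toy usage: a flat image with one bright 2x2 region, scored against a
# single zero-mean component standing in for "normal" appearance.
img = np.zeros((4, 4))
img[2:4, 0:2] = 5.0
amap = anomaly_map(img, 2, np.array([1.0]), np.zeros((1, 4)), np.array([1.0]))
print(amap)
```

Because patches keep their grid position, a high score maps directly back to an image region; in the toy example the largest score falls in the grid cell covering the bright region. A trained model would replace the raw patches with encoded features and the fixed parameters with ones predicted by the network.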
- Year: 2021
- Venue: arXiv 2021
- Authors: 5
- Hosting: Abstract only
Attribution
- Abstract & full text: arxiv.org/abs/2104.10036
- TL;DR: Semantic Scholar