It has been recognized that the data generated by the denoising diffusion probabilistic model (DDPM) improves adversarial training. After two years of rapid development in diffusion models, a question naturally arises: can better diffusion models further improve adversarial training? This paper gives an affirmative answer by employing the most recent diffusion model, which has higher efficiency ($\sim 20$ sampling steps) and image quality (lower FID score) compared with DDPM. Our adversarially trained models achieve state-of-the-art performance on RobustBench using only generated data (no external datasets). Under the $\ell_\infty$-norm threat model with $\epsilon=8/255$, our models achieve $70.69\%$ and $42.67\%$ robust accuracy on CIFAR-10 and CIFAR-100, respectively, i.e., improving upon previous state-of-the-art models by $+4.58\%$ and $+8.03\%$. Under the $\ell_2$-norm threat model with $\epsilon=128/255$, our models achieve $84.86\%$ on CIFAR-10 ($+4.44\%$). These results also beat previous works that use external data. We also provide compelling results on the SVHN and TinyImageNet datasets. Our code is available at https://github.com/wzekai99/DM-Improves-AT.
Better Diffusion Models Further Improve Adversarial Training
The recent diffusion model enhances adversarial training using only generated data, achieving top performance on RobustBench under various threat models.
- Year: 2023
- Venue: arXiv 2023
- Authors: 6
- Hosting: Abstract only
Attribution
- Abstract & full text: arxiv.org/abs/2302.04638v2
- TL;DR: Semantic Scholar