0

Non-vacuous Generalization Bounds for Deep Neural Networks without any modification to the trained models

Understanding and certifying the behavior of modern deep neural networks remains a fundamental challenge in reliable machine learning. We introduce a new class of data-dependent generalization bounds that apply directly to trained models, without any modification.

Preview
Year
2025
Hosting
Full text hostedCC-BY-SA-4.0

Cite

Notes

Only stored in your browser.

Attribution

Abstract & full text
arxiv.org/abs/2503.07325CC-BY-SA-4.0
TL;DR
Semantic Scholar
Attribution policy →

Abstract

Understanding and certifying the behavior of modern deep neural networks remains a fundamental challenge in reliable machine learning. We introduce a new class of data-dependent generalization bounds that apply directly to trained models, without any modification. In particular, we present an exactly computable bound that is non-vacuous across all evaluated networks, including ImageNet-scale models with 600M parameters. This this is the first work showing that meaningful generalization guarantees are achievable even for large, unaltered deep networks. Our approach reveals that generalization is governed by the interaction between the trained model and the geometry of the data distribution. We decompose the generalization error into two interpretable components: a distributional complexity term, capturing how the data mass is distributed across the input space, and local model-behavior terms, capturing the network's behavior within individual regions. This joint dependence identifies where and why generalization gaps arise. Empirically, some components of our bound are highly predictive of the true test error, and the bound tightens when the partition aligns with the intrinsic data geometry, highlighting data-dependent local regularity as a key driver of generalization.