Anytime PAC-Bayes for Constrained Density-Ratio Networks under Covariate Shift

Year: 2026
License: CC-BY-4.0
Full text: arxiv.org/abs/2605.17212

Abstract

A unified framework for learning under covariate shift is presented, in which a constrained density-ratio network approximates the Radon-Nikodym derivative $r^\star = dP/dQ$ and feeds an anytime PAC-Bayes generalization certificate. A change-of-measure identity decomposes the gap between target risk and importance-weighted source risk into a ratio-bias term governed by $\|r_\theta - r^\star\|_{L^2(Q)}$ and a generalization-gap term governed by the variability of the weighted loss. Normalization and moment-matching identities are enforced as hard integral constraints through an augmented-Lagrangian scheme, with a second-moment penalty controlling the effective sample size. PAC-Bayes is first instantiated on the weighted risk in a fixed-time regime, which yields Bernoulli-KL bounds, identifies the network-weighted Gibbs posterior as the unique KL-regularized minimizer, and quantifies stability under $L^2(Q)$ perturbations of the learned ratio. Geometric peeling then strengthens the fixed-time result to an anytime certificate uniform in $t \geq t_{\min}$. A pre-registered two-campaign protocol, combining a patch test against analytic ground truth with a real-data deployment, validates the framework: the network produces calibrated ratios, reduces target 0/1 loss relative to unweighted ERM and classical direct ratio-estimation baselines, and attains the anytime certificate. A single fixed-time coverage failure is recorded. Per-split coverage aligns one-to-one with the magnitude of the label shift, confirming that the failure reflects the operational tightness of the covariate-only assumption rather than a defect of the certificate.
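
The change-of-measure decomposition mentioned in the abstract can be sketched as follows. This is a reconstruction under stated assumptions, not the paper's own statement: the loss $\ell$ is assumed to take values in $[0,1]$, $P$ is assumed absolutely continuous with respect to $Q$, the sample $z_1,\dots,z_n$ is assumed drawn i.i.d. from $Q$, and the symbols $R_P$ and $\widehat{R}_{Q,r_\theta}$ are notation introduced here for the target risk and the importance-weighted empirical source risk.

```latex
% Assumed notation (not from the source): R_P(h) = E_P[\ell(h;z)],
% \widehat{R}_{Q,r_\theta}(h) = (1/n) \sum_i r_\theta(x_i) \ell(h;z_i).
\begin{align*}
R_P(h)
  &= \mathbb{E}_{Q}\!\left[ r^\star \,\ell(h;z) \right]
  && \text{(change of measure, } r^\star = dP/dQ\text{)} \\[4pt]
R_P(h) - \widehat{R}_{Q,r_\theta}(h)
  &= \underbrace{\mathbb{E}_{Q}\!\left[ (r^\star - r_\theta)\,\ell(h;z) \right]}_{\text{ratio bias}}
   + \underbrace{\mathbb{E}_{Q}\!\left[ r_\theta\,\ell(h;z) \right]
     - \frac{1}{n}\sum_{i=1}^{n} r_\theta(x_i)\,\ell(h;z_i)}_{\text{generalization gap}} \\[4pt]
\left| \mathbb{E}_{Q}\!\left[ (r^\star - r_\theta)\,\ell(h;z) \right] \right|
  &\le \| r_\theta - r^\star \|_{L^2(Q)} \, \| \ell(h;\cdot) \|_{L^2(Q)}
   \le \| r_\theta - r^\star \|_{L^2(Q)}
  && \text{(Cauchy-Schwarz, } \ell \in [0,1]\text{)}
\end{align*}
```

Under these assumptions the first term depends only on the quality of the learned ratio, while the second is the deviation of an empirical mean of the weighted loss from its expectation under $Q$, which is the quantity a PAC-Bayes bound on the weighted risk would control.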