
How Accurately Can a Gaussian Approximate Stochastic Approximation Iterates?

Year: 2026

Abstract & full text: arxiv.org/abs/2602.13906 (CC-BY-4.0)

Abstract

Stochastic approximation (SA) is a method for finding the root of an operator perturbed by noise. This paper studies the distribution of SA iterates in finite time. In general, it is not possible to characterize the exact distribution, and therefore our goal is to find an approximation that yields useful tail bounds. Inspired by the rich literature on the asymptotic normality of rescaled SA iterates, we approximate the pre-limit distributions by a sequence of Gaussians whose covariance is recursively defined. In particular, we establish explicit bounds on the Wasserstein-1 distance between the rescaled iterate at time $k$ and the aforementioned Gaussian for various choices of step-sizes. Since these covariances converge to the classical asymptotic limit, our analysis also provides a convergence rate for asymptotic normality as a by-product. As an immediate consequence of our bounds, we obtain tail bounds on the error of SA iterates at any time. Finally, we establish the sharpness of our rates by providing matching lower bounds and validate our findings through simulations. We obtain the sharp rates by first studying the convergence rate of the discrete Ornstein-Uhlenbeck (O-U) process driven by general noise, whose stationary distribution is identical to the limiting Gaussian distribution of the rescaled SA iterates. We believe that this result is of independent interest, given its connection to the sampling literature. The analysis involves adapting Stein's method for Gaussian approximation to handle matrix-weighted sums of i.i.d. random variables. The desired finite-time bounds for SA are obtained by characterizing the error dynamics between the rescaled SA iterate and the discrete-time O-U process and combining them with the convergence rate of the latter process.
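
To make the idea of a Gaussian with recursively defined covariance concrete, the following is a minimal toy sketch and not the paper's construction. It assumes a scalar linear SA recursion x_{k+1} = x_k - alpha_k (a x_k - w_k) with i.i.d. Rademacher noise, a hypothetical step-size schedule alpha_j = (j + 10)^(-0.75), and rescaling by the square root of the last step size. All function names (`variance_recursion`, `simulate_sa`, `empirical_w1`) are illustrative. For this linear model the variance of x_k satisfies an exact recursion, so one can compare the rescaled iterate with a Gaussian of matching variance through an empirical Wasserstein-1 distance.

```python
import numpy as np

def variance_recursion(a, sigma2, alphas, v0=0.0):
    """Variance of x_k for x_{k+1} = (1 - alpha_k * a) x_k + alpha_k w_k (toy model)."""
    v = v0
    for alpha in alphas:
        v = (1.0 - alpha * a) ** 2 * v + alpha ** 2 * sigma2
    return v

def simulate_sa(a, alphas, n_paths, rng):
    """Run n_paths independent scalar SA chains from x_0 = 0."""
    x = np.zeros(n_paths)
    for alpha in alphas:
        # Rademacher noise: non-Gaussian, mean 0, variance 1 (an assumption of this sketch).
        w = rng.choice([-1.0, 1.0], size=n_paths)
        x = x - alpha * (a * x - w)
    return x

def empirical_w1(u, v):
    """W1 distance between two equal-size 1-D samples via the sorted coupling."""
    return float(np.mean(np.abs(np.sort(u) - np.sort(v))))

rng = np.random.default_rng(0)
a, sigma2, k = 1.0, 1.0, 200
alphas = 1.0 / (np.arange(k) + 10.0) ** 0.75  # hypothetical step-size schedule

x_k = simulate_sa(a, alphas, n_paths=20000, rng=rng)
v_k = variance_recursion(a, sigma2, alphas)

# Rescale by the last step size used (a choice made for this sketch).
alpha_last = alphas[-1]
rescaled = x_k / np.sqrt(alpha_last)
gauss = rng.normal(0.0, np.sqrt(v_k / alpha_last), size=rescaled.size)

w1 = empirical_w1(rescaled, gauss)
print("recursive variance of rescaled iterate:", v_k / alpha_last)
print("empirical variance of rescaled iterate:", np.var(rescaled))
print("empirical W1 to matching Gaussian:", w1)
```

The empirical distance here includes Monte Carlo error from both samples, so it only gives a rough indication of how close the pre-limit distribution is to its Gaussian approximation; it is not a substitute for the explicit bounds described in the abstract.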