We analyse linear ensemble sampling (ES) with standard Gaussian perturbations in stochastic linear bandits. We show that for ensemble size m=Θ(d\log n), ES attains \tilde O(d^{3/2}\sqrt n) high-probability regret, closing the gap to the Thompson sampling benchmark while keeping computation comparable. The proof brings a new perspective on randomized exploration in linear bandits by reducing the analysis to a time-uniform exceedance problem for m independent Brownian motions. This continuous-time lens appears particularly natural here: it yields an exact representation of the relevant discrete-time processes, and we do not know another route to a sharp ES bound.
Sharp analysis of linear ensemble sampling
We analyse linear ensemble sampling (ES) with standard Gaussian perturbations in stochastic linear bandits. We show that for ensemble size $m=Θ(d\log n)$, ES attains $\tilde O(d^{3/2}\sqrt n)$ high-probability regret, closing the gap to the Thompson sampling benchmark while…
- Preview

- Year
- 2026
- Hosting
- Full text hostedCC-BY-4.0
Cite
Notes
Only stored in your browser.
Attribution
- Abstract & full text
- arxiv.org/abs/2602.08026CC-BY-4.0
- TL;DR
- Semantic Scholar