0

Asynchronous Stochastic Approximation with Applications to Average-Reward Reinforcement Learning

This paper investigates the stability and convergence properties of asynchronous stochastic approximation (SA) algorithms, with a focus on extensions relevant to average-reward reinforcement learning.

Year
2024
Hosting
Abstract onlyARXIV-DEFAULT

Cite

Notes

Only stored in your browser.

Attribution

Abstract & full text
arxiv.org/abs/2409.03915ARXIV-DEFAULT
TL;DR
Semantic Scholar
Attribution policy →

Abstract

This paper investigates the stability and convergence properties of asynchronous stochastic approximation (SA) algorithms, with a focus on extensions relevant to average-reward reinforcement learning. We first extend a stability proof method of Borkar and Meyn to accommodate more general noise conditions than previously considered, thereby yielding broader convergence guarantees for asynchronous SA. To sharpen the convergence analysis, we further examine the shadowing properties of asynchronous SA, building on a dynamical systems approach of Hirsch and Benaïm. These results provide a theoretical foundation for a class of relative value iteration-based reinforcement learning algorithms -- developed and analyzed in a companion paper -- for solving average-reward Markov and semi-Markov decision processes.