Bridging the Gap Between Target Networks and Functional Regularization

Bootstrapping is behind much of the successes of Deep Reinforcement Learning. However, learning the value function via bootstrapping often leads to unstable training due to fast-changing target values.

Open

Year: 2022
ArXiv: arxiv.org/abs/2210.12282
URL: arxiv.org/abs/2210.12282v2
Hosting: External sourcelicense unknown

Cite

Notes

Only stored in your browser.

Attribution

Abstract & full text: arxiv.org/abs/2210.12282v2
TL;DR: Semantic Scholar

Attribution policy →