Provable Accelerated Convergence of Nesterov's Momentum for Deep ReLU Neural Networks

Current state-of-the-art analyses of the convergence of gradient descent for training neural networks focus on characterizing properties of the loss landscape, such as the Polyak-Łojasiewicz (PL) condition and restricted strong convexity.

Year: 2023
ArXiv: arxiv.org/abs/2306.08109
URL: arxiv.org/abs/2306.08109v2
Hosting: External source, license unknown
