Provable Accelerated Convergence of Nesterov's Momentum for Deep ReLU Neural Networks

Current state-of-the-art analyses of the convergence of gradient descent for training neural networks focus on characterizing properties of the loss landscape, such as the Polyak-Łojasiewicz (PL) condition and restricted strong convexity.

Year: 2023
Hosting: External source, license unknown

Abstract & full text: arxiv.org/abs/2306.08109v2
TL;DR: Semantic Scholar