SANIA: Polyak-type Optimization Framework Leads to Scale Invariant Stochastic Algorithms
Adaptive optimization methods are widely recognized as among the most popular approaches for training Deep Neural Networks (DNNs). Techniques such as Adam, AdaGrad, and AdaHessian utilize a preconditioner that modifies the search direction by incorporating information about the…
- Year: 2023
- Venue: arXiv 2023
- Authors: 5
- Hosting: External source, license unknown