Attribution
The Surprising Agreement Between Convex Optimization Theory and Learning-Rate Scheduling for Large Model Training
arXiv 2025
MoMo: Momentum Models for Adaptive Learning Rates
arXiv 2023
from 2 papers
Aaron Defazio
Adrien Taylor
Alexander Hägele
Francis Bach
Michael Eickenberg
Robert M. Gower
Ruben Ohana
Umut Simsekli