On the Interplay Between Stepsize Tuning and Progressive Sharpening

Recent empirical work has revealed an intriguing property of deep learning models by which the sharpness (largest eigenvalue of the Hessian) increases throughout optimization until it stabilizes around a critical value at which the optimizer operates at the edge of stability,…
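
As a rough illustration of the quantity the abstract refers to, sharpness can be estimated by power iteration on Hessian-vector products. The sketch below is not taken from the paper: the function name `estimate_sharpness`, the toy quadratic loss, and the step size are all assumptions chosen for illustration. The threshold 2 / step size is the classical stability bound for plain gradient descent on a quadratic, used here only as a reference point.

```python
import numpy as np


def estimate_sharpness(hvp, dim, num_iters=200, seed=0):
    """Estimate the largest-magnitude Hessian eigenvalue by power iteration.

    `hvp` is a hypothetical callable mapping a vector v to the
    Hessian-vector product H @ v. For a positive semidefinite Hessian
    the result is the largest eigenvalue, i.e. the sharpness.
    """
    rng = np.random.default_rng(seed)
    v = rng.standard_normal(dim)
    v /= np.linalg.norm(v)
    eigenvalue = 0.0
    for _ in range(num_iters):
        hv = hvp(v)
        eigenvalue = float(v @ hv)  # Rayleigh quotient for unit-norm v
        norm = np.linalg.norm(hv)
        if norm == 0.0:
            break  # v lies in the null space; nothing more to iterate
        v = hv / norm
    return eigenvalue


# Toy quadratic loss 0.5 * x^T H x with a known Hessian (illustrative only).
hessian = np.diag([1.0, 3.0, 10.0])
sharpness = estimate_sharpness(lambda v: hessian @ v, dim=3)

# Assumed step size; gradient descent on a quadratic is stable
# when sharpness < 2 / step_size.
step_size = 0.1
stability_threshold = 2.0 / step_size
print(sharpness, stability_threshold, sharpness < stability_threshold)
```

For a neural network, the explicit matrix product would be replaced by a Hessian-vector product computed with automatic differentiation, so the full Hessian never needs to be formed.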

Year: 2023
ArXiv: arxiv.org/abs/2312.00209
URL: arxiv.org/abs/2312.00209v3
Hosting: External source, license unknown

Attribution

Abstract & full text: arxiv.org/abs/2312.00209v3
TL;DR: Semantic Scholar
