Attribution
SOAP: Improving and Stabilizing Shampoo using Adam
arXiv 2024
How Does Critical Batch Size Scale in Pre-training?
Loss-to-Loss Prediction: Scaling Laws for All Datasets
from 3 papers
Sham Kakade
David Brandfonbrener
Depen Morwani
Dean Foster
Difan Zou
Eran Malach
Hanlin Zhang
Itai Shapira
Jingfeng Wu
Lucas Janson