Attribution
How Does Critical Batch Size Scale in Pre-training?
arXiv 2024
from 1 paper
Dean Foster
Depen Morwani
Difan Zou
Hanlin Zhang
Jingfeng Wu
Nikhil Vyas
Sham Kakade