Attribution
AXLearn: Modular Large Model Training on Heterogeneous Infrastructure
arXiv 2025
FAdam: Adam is a natural gradient optimizer using diagonal empirical Fisher information
arXiv 2024
from 2 papers
BoWen Zhang
Chang Lan
Cheng Leong
Chung-Cheng Chiu
Danyang Zhuo
David Qiu
Floris Weers
Guoli Yin
Hanzhi Zhou
Haoshuo Huang