Attribution
u-$μ$P: The Unit-Scaled Maximal Update Parametrization
arXiv 2024
Scalify: scale propagation for efficient low-precision LLM training
SparQ Attention: Bandwidth-Efficient LLM Inference
arXiv 2023
Unit Scaling: Out-of-the-Box Low-Precision Training
from 4 papers
Charlie Blake
Douglas Orr
Andres Felipe Cruz-Salinas
Andrew Fitzgibbon
Björn Deiseroth
Constantin Eichenberg
Ivan Chelombiev
Josef Dean
Luka Ribar
Lukas Balles