ZClip: Adaptive Spike Mitigation for LLM Pre-Training
arXiv 2025
A Refined Analysis of Massive Activations in LLMs
Variance Control via Weight Rescaling in LLM Pre-training
from 3 papers
Fabian Güra
Louis Owen
Nilabhra Roy Chowdhury