Attribution
Few-Bit Backward: Quantized Gradients of Activation Functions for Memory Footprint Reduction
arXiv 2022
Memory-Efficient Backpropagation through Large Linear Layers
NAG-GS: Semi-Implicit, Accelerated and Robust Stochastic Optimizer
from 3 papers
Ivan Oseledets
Daniil Merkulov
Julia Gusak
Aleksandr Katrutsa
Aleksandr Mikhalev
Alex Shonenkov
Alexandr Katrutsa
Denis Dimitrov
Georgii Novikov
Olga Tsymboi