The boundary of neural network trainability is fractal

The boundary between hyperparameters that give stable neural network training and those that give divergent training is fractal, much like the boundary of the Mandelbrot set.

Year
2024
Venue
arXiv 2024
Authors
1

Attribution

Abstract & full text
arxiv.org/abs/2402.06184
TL;DR
Semantic Scholar

Abstract

Some fractals -- for instance those associated with the Mandelbrot and quadratic Julia sets -- are computed by iterating a function, and identifying the boundary between hyperparameters for which the resulting series diverges or remains bounded. Neural network training similarly involves iterating an update function (e.g. repeated steps of gradient descent), can result in convergent or divergent behavior, and can be extremely sensitive to small changes in hyperparameters. Motivated by these similarities, we experimentally examine the boundary between neural network hyperparameters that lead to stable and divergent training. We find that this boundary is fractal over more than ten decades of scale in all tested configurations.
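
The parallel drawn in the abstract can be made concrete with a minimal sketch. It is not the paper's experimental setup: the scalar two-weight tanh network, the three data points, the step count, the divergence threshold, and every function name below are assumptions chosen only for illustration. The same helper iterates an update function and reports whether the state stays bounded, first for the Mandelbrot map and then for full-batch gradient descent with a separate learning rate per weight.

```python
import math

def stays_bounded(step, state, size=abs, n_steps=200, bound=1e6):
    """Iterate `step` from `state`; True if size(state) never exceeds `bound`."""
    for _ in range(n_steps):
        state = step(state)
        if not size(state) <= bound:  # written this way so NaN also counts as divergence
            return False
    return True

def mandelbrot_step(c):
    """Update z -> z*z + c for a fixed complex parameter c."""
    return lambda z: z * z + c

# Hypothetical toy data (x, y) pairs, chosen only for illustration.
TOY_DATA = ((-1.0, -0.6), (0.5, 0.4), (1.0, 0.6))

def tiny_net_step(lr1, lr2, data=TOY_DATA):
    """One full-batch gradient step on 0.5 * mean((w2 * tanh(w1 * x) - y)**2)."""
    def step(w):
        w1, w2 = w
        g1 = g2 = 0.0
        for x, y in data:
            h = math.tanh(w1 * x)
            err = w2 * h - y
            g1 += err * w2 * (1.0 - h * h) * x  # d(loss)/d(w1) contribution
            g2 += err * h                       # d(loss)/d(w2) contribution
        n = len(data)
        return (w1 - lr1 * g1 / n, w2 - lr2 * g2 / n)
    return step

def scan(lrs1, lrs2, w0=(0.5, 0.5)):
    """Grid of booleans: does training stay bounded at each (lr1, lr2)?"""
    norm = lambda w: math.hypot(*w)
    return [[stays_bounded(tiny_net_step(a, b), w0, size=norm) for a in lrs1]
            for b in lrs2]
```

Calling `scan` on a fine grid of learning rates and zooming in on the border between `True` and `False` cells is the kind of experiment the abstract describes. Whether a toy this small shows any fine structure at that border is not something this sketch establishes.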