Mean-Field Model for Two-Layer Neural Networks Trained with Consensus-Based Optimization

We study Consensus-Based Optimization (CBO) for training two-layer neural networks. We compare CBO against Adam on two test cases and demonstrate that a hybrid approach combining CBO with Adam converges faster than CBO alone. In the context of multi-task learning, we additionally recast CBO into a formulation with lower memory overhead. CBO admits a mean-field formulation, which we couple with the mean-field model of the neural network. To this end, we first reformulate CBO within the optimal transport framework. In the limit of infinitely many particles, we lift the corresponding dynamics to the Wasserstein-over-Wasserstein space and show that the variance decreases monotonically. We confirm numerically that both mean-field models converge.
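
For readers unfamiliar with CBO, the following is a minimal sketch of a generic particle scheme of this type, assuming the anisotropic (component-wise noise) variant common in the literature with an explicit Euler-Maruyama discretization. It is not necessarily the variant studied here; the function names (`consensus_point`, `cbo_minimize`) and all parameter values (`lam`, `sigma`, `alpha`, `dt`) are illustrative assumptions.

```python
import numpy as np

def consensus_point(x, fx, alpha):
    """Gibbs-weighted average of the particles (the consensus point)."""
    # Subtract the minimum before exponentiating for numerical stability;
    # the shift cancels in the normalized weights.
    w = np.exp(-alpha * (fx - fx.min()))
    return (w[:, None] * x).sum(axis=0) / w.sum()

def cbo_minimize(f, dim, n_particles=200, n_steps=1000, dt=0.01,
                 lam=1.0, sigma=0.7, alpha=30.0, seed=0):
    """Sketch of anisotropic CBO; all defaults are assumed, untuned values."""
    rng = np.random.default_rng(seed)
    x = rng.uniform(-3.0, 3.0, size=(n_particles, dim))
    for _ in range(n_steps):
        c = consensus_point(x, f(x), alpha)
        diff = x - c
        noise = rng.standard_normal(x.shape)
        # Drift toward the consensus point plus noise scaled component-wise
        # by the distance to it, so exploration vanishes at consensus.
        x = x - lam * dt * diff + sigma * np.sqrt(dt) * diff * noise
    return consensus_point(x, f(x), alpha), x

if __name__ == "__main__":
    # Toy objective with minimizer at (1, 1); f maps (n, dim) -> (n,).
    f = lambda x: np.sum((x - 1.0) ** 2, axis=1)
    c, _ = cbo_minimize(f, dim=2)
    print("consensus point:", c)
```

In a training setting, each particle would hold a full set of network parameters and `f` would evaluate the loss per particle. The scheme uses only function evaluations and no gradients.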