Tail Annealing for Heavy-Tailed Flow Matching

Standard generative models struggle with heavy-tailed data: Lipschitz architectures cannot produce power-law tails from Gaussian noise, and interpolating between heavy-tailed data and Gaussians is ill-posed. We propose a simple fix: apply the soft-log transform ϕ(x) = sign(x) · log(1 + |x|) coordinate-wise to the data before training, then apply the inverse map ϕ^{-1}(y) = sign(y) · (e^{|y|} − 1) to samples after generation. A Hill diagnostic decides per coordinate whether to transform, leaving light-tailed margins untouched at no added complexity. The transform compresses heavy tails into a range where standard flow matching succeeds, without requiring heavy-tailed base distributions or architectural modifications. We provide theoretical intuition for why this works: the log-transform maps Pareto tails to exponential tails, and the induced dynamics implement a form of tail annealing via power transformations. On a 144-configuration multivariate benchmark (3 copulas, d up to 100, 4 tail indices), the resulting method, Log-FM, dominates specialized baselines on W_1, CVaR_{99}, and extreme-quantile metrics, and is the only method with zero severe divergences across 2,880 runs.
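As a rough illustration of the preprocessing described above, the following minimal sketch pairs the soft-log transform with its inverse and uses a Hill estimate of the tail index to choose which coordinates to transform. The function names, the fraction of order statistics used (k_frac), and the decision threshold (alpha_max) are illustrative assumptions, not values taken from the paper; the flow-matching model itself is omitted. The intuition is that if P(|X| > x) decays like x^{-α}, then P(log(1 + |X|) > y) decays like e^{-αy}, so a power-law tail becomes an exponential one.

```python
import numpy as np

def soft_log(x):
    """Soft-log transform: sign(x) * log(1 + |x|), applied elementwise."""
    return np.sign(x) * np.log1p(np.abs(x))

def soft_exp(y):
    """Inverse of soft_log: sign(y) * (exp(|y|) - 1)."""
    return np.sign(y) * np.expm1(np.abs(y))

def hill_tail_index(x, k):
    """Hill estimate of the tail index alpha from the k largest values of |x|."""
    a = np.sort(np.abs(x))[::-1]        # descending order statistics of |x|
    top, threshold = a[:k], a[k]        # k exceedances over the (k+1)-th largest
    gamma = np.mean(np.log(top) - np.log(threshold))
    return 1.0 / gamma

def heavy_tail_mask(data, k_frac=0.05, alpha_max=4.0):
    """Flag coordinates whose estimated tail index falls below alpha_max.

    k_frac and alpha_max are hypothetical defaults chosen for this sketch.
    """
    n, d = data.shape
    k = max(int(k_frac * n), 10)
    alphas = np.array([hill_tail_index(data[:, j], k) for j in range(d)])
    return alphas < alpha_max

def to_training_space(data, mask):
    """Apply soft_log only to the flagged (heavy-tailed) coordinates."""
    out = data.astype(float)            # astype returns a copy
    out[:, mask] = soft_log(out[:, mask])
    return out

def from_training_space(samples, mask):
    """Undo the transform on generated samples."""
    out = samples.astype(float)
    out[:, mask] = soft_exp(out[:, mask])
    return out
```

In use, one would compute the mask once on the training set, train a standard flow-matching model on to_training_space(data, mask), and pass generated samples through from_training_space with the same mask. Because the Hill estimate depends on the choice of k, the threshold should be treated as a tunable diagnostic rather than a fixed rule.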