Data-Driven Weak-form Discovery of Stochastic Systems

Year: 2026
Full text: arxiv.org/abs/2603.20904 (CC-BY-4.0)

Abstract

We present an algorithm for learning the governing equations of a stochastic dynamical system from trajectory data. It recovers interpretable symbolic expressions for both the drift b(x) and the diffusion a(x) in a single pass, yielding a model that can be queried directly for relaxation timescales, metastable escape rates, and stationary distributions. Rather than estimating the dynamics one time step at a time, the algorithm averages each candidate term across the whole trajectory before regressing; a drift-informed correction further removes the finite-sampling bias in the diffusion estimate, cutting it from 4.6% to 0.6% for state-dependent noise. We also show that the trajectory averaging must use a spatial rather than a temporal weighting: temporal weighting, as in existing weak-form methods, is biased for stochastic data with an error that grows with dataset size. On three benchmark systems (the Ornstein-Uhlenbeck process, a double-well Langevin system, and a multiplicative-noise system), the algorithm recovers all coefficients to within 5%, stationary densities to within 0.01 in total variation, and escape rates that match the true dynamics.
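
To make the "average each candidate term with a spatial weighting, then regress" idea concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the authors' implementation. The function names (`simulate_ou`, `spatial_weights`, `library`, `fit`), the Gaussian bump weights, the polynomial library 1, x, x^2, and the Euler-Maruyama test data are all hypothetical choices. The sketch also assumes a(x) denotes the squared noise amplitude, and it reads the "drift-informed correction" in the simplest possible way, as subtracting the fitted drift from each increment before squaring. For weights w_k(x) and candidate terms f_j(x), it solves sum_t w_k(X_t) dX_t ≈ sum_j c_j sum_t w_k(X_t) f_j(X_t) dt in the least-squares sense.

```python
import numpy as np

def simulate_ou(theta=1.0, sigma=0.5, dt=0.01, n_steps=200_000, seed=0):
    """Euler-Maruyama path of dX = -theta * X dt + sigma dW (test data only)."""
    rng = np.random.default_rng(seed)
    noise = sigma * np.sqrt(dt) * rng.standard_normal(n_steps)
    x = np.empty(n_steps + 1)
    x[0] = 0.0
    for i in range(n_steps):
        x[i + 1] = x[i] - theta * x[i] * dt + noise[i]
    return x

def spatial_weights(x, centers, width):
    """Gaussian bumps in state space (an assumed choice), shape (T, K)."""
    return np.exp(-0.5 * ((x[:, None] - centers[None, :]) / width) ** 2)

def library(x):
    """Assumed candidate terms 1, x, x^2, shape (T, 3)."""
    return np.stack([np.ones_like(x), x, x**2], axis=1)

def fit(x, dt, centers, width):
    """Regress trajectory-averaged increments on trajectory-averaged terms."""
    x0, dx = x[:-1], np.diff(x)
    W = spatial_weights(x0, centers, width)   # weights evaluated at X_t
    F = library(x0)                           # candidate terms at X_t
    G = W.T @ F * dt                          # (K, 3): each term averaged per weight
    drift_coef, *_ = np.linalg.lstsq(G, W.T @ dx, rcond=None)
    # Subtract the fitted drift before squaring, so the squared increment
    # no longer carries the b(x)^2 dt^2 contribution.
    resid = dx - F @ drift_coef * dt
    diff_coef, *_ = np.linalg.lstsq(G, W.T @ resid**2, rcond=None)
    return drift_coef, diff_coef

x = simulate_ou()
drift_coef, diff_coef = fit(x, dt=0.01, centers=np.linspace(-1.0, 1.0, 9), width=0.25)
print("drift coefficients for [1, x, x^2]:", drift_coef)
print("diffusion coefficients for [1, x, x^2]:", diff_coef)
```

With the default parameters the simulated system has drift -x and squared noise amplitude 0.25, so the fitted coefficients should land near [0, -1, 0] and [0.25, 0, 0] up to sampling error. Each weight is evaluated at the start of its increment and depends only on the state, so it is uncorrelated with that increment's noise, which is the intuition behind preferring spatial weights here.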