We present an algorithm for learning the governing equations of a stochastic dynamical system from trajectory data. It recovers interpretable symbolic expressions for both the drift b(x) and the diffusion a(x) in a single pass, yielding a model that can be queried directly for relaxation timescales, metastable escape rates, and stationary distributions. Rather than estimating the dynamics one time step at a time, the algorithm averages each candidate term across the whole trajectory before regressing; a drift-informed correction further removes the finite-sampling bias in the diffusion estimate, cutting it from 4.6% to 0.6% for state-dependent noise. We also show that the trajectory averaging must use a spatial rather than a temporal weighting: temporal weighting, as in existing weak-form methods, is biased for stochastic data with an error that grows with dataset size. On three benchmark systems -- the Ornstein--Uhlenbeck process, a double-well Langevin system, and a multiplicative-noise system -- the algorithm recovers all coefficients to within 5%, stationary densities to within 0.01 in total variation, and escape rates that match the true dynamics.
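The idea of averaging each candidate term over the trajectory with a spatial weighting, then regressing, can be illustrated with a minimal sketch. This is not the paper's implementation: the Gaussian bump weights, the three-term monomial library, the residual-based diffusion fit, the convention dX = b(x) dt + sqrt(a(x)) dW, and all function names are assumptions made for illustration. The test data is an Euler-Maruyama simulation of an Ornstein--Uhlenbeck process with b(x) = -x and a(x) = 0.5.

```python
import numpy as np


def simulate_ou(n_steps, dt, theta=1.0, a=0.5, seed=0):
    """Euler-Maruyama path of dX = -theta*X dt + sqrt(a) dW (assumed convention)."""
    rng = np.random.default_rng(seed)
    noise = rng.standard_normal(n_steps) * np.sqrt(a * dt)
    x = np.empty(n_steps + 1)
    x[0] = 0.0
    for i in range(n_steps):
        x[i + 1] = x[i] - theta * x[i] * dt + noise[i]
    return x


def library(x):
    """Hypothetical candidate library with columns 1, x, x^2."""
    return np.stack([np.ones_like(x), x, x**2], axis=1)


def spatial_weights(x, n_bumps=15):
    """Gaussian bumps in state space evaluated at each sample, shape (N, n_bumps)."""
    lo, hi = np.quantile(x, [0.01, 0.99])
    centers = np.linspace(lo, hi, n_bumps)
    width = centers[1] - centers[0]
    return np.exp(-0.5 * ((x[:, None] - centers[None, :]) / width) ** 2)


def fit(x, dt):
    """Regress weighted trajectory averages onto the library for drift and diffusion."""
    x0, dx = x[:-1], np.diff(x)
    # Weights depend only on the state at the start of each step, so they are
    # uncorrelated with the noise increment that follows.
    phi = spatial_weights(x0)
    feats = library(x0)
    # Each row is one spatial weight function averaged against every library term.
    gram = phi.T @ feats * dt
    drift_coef, *_ = np.linalg.lstsq(gram, phi.T @ dx, rcond=None)
    # Assumed drift-informed step: subtract the fitted drift before squaring.
    resid = dx - feats @ drift_coef * dt
    diff_coef, *_ = np.linalg.lstsq(gram, phi.T @ resid**2, rcond=None)
    return drift_coef, diff_coef


dt = 0.01
x = simulate_ou(200_000, dt)
drift_coef, diff_coef = fit(x, dt)
print("drift coefficients (1, x, x^2):", drift_coef)
print("diffusion coefficients (1, x, x^2):", diff_coef)
```

On this synthetic path the drift coefficient on x should come out close to -1 and the constant diffusion coefficient close to 0.5, up to sampling noise. Because each weight is evaluated at the state before the increment, the noise contribution to every weighted average has zero mean, which is the property the sketch is meant to show.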
Data-Driven Weak-form Discovery of Stochastic Systems
- Year: 2026
- Hosting: full text hosted (CC-BY-4.0)
Attribution
- Abstract & full text: arxiv.org/abs/2603.20904 (CC-BY-4.0)
- TL;DR: Semantic Scholar