VaSST: Variational Inference for Symbolic Regression using Soft Symbolic Trees

Year: 2026
Hosting: full text hosted, CC-BY-4.0

Abstract & full text: arxiv.org/abs/2602.23561 (CC-BY-4.0)

Abstract

Symbolic regression (SR) has gained recent traction in AI-driven scientific discovery for learning closed-form physical laws. Yet existing methods are dominated by heuristic search or data-intensive approaches that often assume low-noise regimes and lack principled uncertainty quantification, while fully probabilistic SR formulations remain scarce. We introduce a scalable probabilistic framework for SR, VaSST, based on variational inference. VaSST uses soft symbolic trees, a continuous relaxation of symbolic expression trees in which discrete operator and feature assignments are replaced by probability distributions over allowable components. This transforms combinatorial symbolic search through an astronomically large expression space into efficient gradient-based optimization while preserving a coherent probabilistic interpretation. The learned soft representations induce posterior distributions over symbolic structures, enabling uncertainty quantification across plausible symbolic forms through posterior-aware symbolic model selection. On simulated experiments and the Feynman Symbolic Regression Database, VaSST achieves strong structural recovery and predictive accuracy compared to state-of-the-art competing SR methods.
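
To make the idea of a soft symbolic tree concrete, the following is a minimal sketch of a single relaxed node, not the authors' implementation. It assumes a small hypothetical operator set (`add`, `mul`, `sub`), a softmax parameterization of the operator and feature probabilities, and a node output defined as the probability-weighted mixture of candidate outputs. All function names (`soft_leaf`, `soft_binary_node`, `hard_decode`) are illustrative and do not come from the paper.

```python
import numpy as np

# Hypothetical operator set; the paper's actual operator library may differ.
BINARY_OPS = {
    "add": lambda a, b: a + b,
    "mul": lambda a, b: a * b,
    "sub": lambda a, b: a - b,
}


def softmax(z):
    """Numerically stable softmax over a 1-D array of logits."""
    z = np.asarray(z, dtype=float)
    e = np.exp(z - z.max())
    return e / e.sum()


def soft_leaf(X, feature_logits):
    """Soft feature assignment: a probability-weighted mix of the columns of X."""
    return X @ softmax(feature_logits)


def soft_binary_node(left, right, op_logits):
    """Soft operator assignment: a probability-weighted mix of operator outputs."""
    p = softmax(op_logits)
    outputs = np.stack([op(left, right) for op in BINARY_OPS.values()], axis=0)
    return p @ outputs


def hard_decode(op_logits, left_logits, right_logits):
    """Read off the most probable discrete expression from the soft parameters."""
    names = list(BINARY_OPS)
    op = names[int(np.argmax(op_logits))]
    i = int(np.argmax(left_logits))
    j = int(np.argmax(right_logits))
    return f"{op}(x{i}, x{j})"


# Two samples, two features.
X = np.array([[1.0, 2.0], [3.0, 4.0]])

# Sharply peaked logits: the soft node behaves like the discrete tree x0 * x1.
left_logits = np.array([50.0, -50.0])
right_logits = np.array([-50.0, 50.0])
op_logits = np.array([-50.0, 50.0, -50.0])

y_soft = soft_binary_node(
    soft_leaf(X, left_logits), soft_leaf(X, right_logits), op_logits
)
expression = hard_decode(op_logits, left_logits, right_logits)
```

Because every quantity above is a smooth function of the logits, a loss on `y_soft` can be minimized with gradient-based optimization (for example via an automatic differentiation library), and the resulting probabilities can be read as a distribution over discrete structures. How VaSST defines its variational family and objective is specified in the full text, not here.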