0

Is Spurious Correlation Removal Always Learnable?

Invariant learning can fail even when the invariant structure is statistically identifiable. We show a conditional computational barrier: under a black-box samplable supervised sparse recovery primitive motivated by average-case sparse-recovery reductions, there exist…

Preview
Year
2026
Hosting
Full text hostedCC-BY-4.0

Cite

Notes

Only stored in your browser.

Attribution

Abstract & full text
arxiv.org/abs/2606.12930CC-BY-4.0
TL;DR
Semantic Scholar
Attribution policy →

Abstract

Invariant learning can fail even when the invariant structure is statistically identifiable. We show a conditional computational barrier: under a black-box samplable supervised sparse recovery primitive motivated by average-case sparse-recovery reductions, there exist samplable multi-environment instances with a one-dimensional predictive invariant subspace (k=1) that are learnable with polynomial samples by exhaustive search, while any polynomial-time constant-accuracy recovery algorithm would contradict the primitive. We further quantify environment diversity by a separation parameter γ, which controls identifiability and the curvature of invariance objectives. Under sufficient diversity and local Gaussian regularity, the minimax risk is \mathbb{E}[\dist(\hat{V},V_{inv})^2]=Θ(k(d-k)/(n|E|)), and under label-induced shifts a phase transition occurs at n^*\propto k(d-k)/(|E|γ^2) with refined estimation error scaling proportional to 1/γ^2. Synthetic and real datasets illustrate the predicted gaps and transitions and motivate simple diversity diagnostics.