Invariance Pair Guidance: Robustness to Spurious Correlations via Corrective Gradients

Machine learning models are inherently bound to the distribution of the training data, often exploiting non-causal shortcuts. As a result, achieving robustness to spurious correlations remains a challenge. While existing approaches rely on data manipulation or re-weighting strategies to achieve robustness, they typically require dense group labels, multiple training domains, or specialized pre-processing. We propose Invariance Pair Guidance (IPG), a method to mitigate reliance on spurious correlations using a sparse set of counterfactual pairs. Unlike other methods demanding extensive supervision, IPG utilizes a novel dual-update mechanism to dynamically correct the optimization trajectory. We generate input pairs that isolate the spurious attribute to define the invariance, a characteristic that should not affect the outcome of the model. Based on these pairs, we define a corrective gradient that complements the traditional gradient descent approach. The correction adapts via a predefined invariance condition. Experiments on ColoredMNIST, Waterbirds-100, and CelebA datasets demonstrate the effectiveness of our approach and its robustness to group shifts, supported by a theoretical convergence analysis. IPG offers a data-efficient and theoretically grounded path to robustness.