0

Disentangled Feature Importance

When predictors are statistically dependent, the appropriate definition of feature importance depends on the operational goal. Conditional-incremental measures are well-suited for feature selection, acquisition, and compression, where shared predictive information is treated as…

Preview
Year
2025
Hosting
Abstract onlyARXIV-DEFAULT

Cite

Notes

Only stored in your browser.

Attribution

Abstract & full text
arxiv.org/abs/2507.00260ARXIV-DEFAULT
TL;DR
Semantic Scholar
Attribution policy →

Abstract

When predictors are statistically dependent, the appropriate definition of feature importance depends on the operational goal. Conditional-incremental measures are well-suited for feature selection, acquisition, and compression, where shared predictive information is treated as redundancy. For post-hoc interpretation, however, the goal is often to attribute predictive signals across correlated measurement channels. We introduce Disentangled Feature Importance (DFI), a population-level attribution framework for this setting. DFI maps covariates to an independent latent representation under a specified entropic optimal transport geometry, computes latent importance, and attributes it back to the original covariates through barycentric sensitivities. We show that broad conditional-incremental FI functionals target conditional incremental predictive value under squared-error loss, and therefore answer a different question from attribution of shared predictive signal under dependence. Under fixed transport cost, reference law, and regularization level, DFI defines a well-specified family of estimands. Latent scores admit a functional ANOVA interpretation, and in the Gaussian linear case, the attributed DFI recovers the classical R^2 decomposition for correlated regressors. We derive influence-function-based inference under nuisance-rate and smoothness conditions, and show in simulations and an HIV-1 neutralization-resistance analysis that DFI yields stable, interpretable, uncertainty-quantified attributions of shared predictive signal.