0

Finding Most Influential Sets

Identifying most influential sets (MIS) - size-$k$ subsets whose removal maximally changes a target estimand - is typically infeasible because it requires searching over $\binom{n}{k}$ subsets.

Preview
Year
2026
Hosting
Full text hostedCC-BY-SA-4.0

Cite

Notes

Only stored in your browser.

Attribution

Abstract & full text
arxiv.org/abs/2606.05919CC-BY-SA-4.0
TL;DR
Semantic Scholar
Attribution policy →

Abstract

Identifying most influential sets (MIS) - size-k subsets whose removal maximally changes a target estimand - is typically infeasible because it requires searching over \binom{n}{k} subsets. For estimands with linear-fractional leave-set-out effects, we show that MIS selection reduces to a one-parameter sequence of top-k problems. Dinkelbach's method yields an algorithm with O(n) cost per iteration and finite termination. For fixed residualized inputs, the algorithm returns a globally optimal set for the univariate ratio objective, including the oracle-residualized partial linear model. With estimated nuisance functions, uniform denominator and generated-score stability imply approximation to the first-order oracle orthogonal-score objective; exact set recovery follows under a separation condition. Simulations and applications show that the method recovers exact MIS that were previously computationally inaccessible.