Classical data valuation defines a data point's value through the finite marginal contribution $U(C\cup\{i\})-U(C)$, but estimating this quantity over coalitions requires repeated training and does not describe the contribution made along a stochastic training path. We ask whether marginal contributions of data points can be estimated from one coupled trajectory while retaining a verifiable relation to coalition-based values. To this end, we introduce Neural Dynamic Data Valuation (NDDV), which models each data point as a controlled stochastic state and computes a first-order marginal-contribution score via the adjoint equation of the Stochastic Maximum Principle (SMP). This raw sensitivity is then calibrated by a mass-preserving redistribution that increases one data point's participation while redistributing the same total weight over the remaining data points. We prove that the resulting backward adjoint recursion is the exact reverse-mode adjoint of the frozen-aggregate Euler system, bound its discrepancy from the mean-field sensitivity, and express each finite coalition marginal as an integral of local sample-weight sensitivities. These results yield pair-specific error bounds and sufficient conditions for ordering agreement with Shapley, Banzhaf, and leave-one-out values. Experiments on existing benchmarks evaluate marginal-contribution fidelity, score-release cost, corrupted-sample detection, ablations, and failure regimes. NDDV is a one-run, trajectory-conditioned estimator, not an unconditional replacement for cooperative-game values.
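The mass-preserving redistribution mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not specify how the weight is redistributed, so the proportional rule below is an assumption, and the function names `redistribution_direction` and `calibrated_scores` are hypothetical. The sketch treats the raw sensitivities as a given list and only shows how a perturbation that raises one point's weight can leave the total weight unchanged.

```python
def redistribution_direction(weights, i):
    """Perturbation that adds one unit of weight to point i and removes one
    unit from the other points, so the total weight is unchanged.

    Assumption: the removed unit is split in proportion to the current
    weights of the other points; the source does not state the rule.
    """
    others_total = sum(w for j, w in enumerate(weights) if j != i)
    if others_total <= 0:
        raise ValueError("the remaining points must carry positive weight")
    direction = [-w / others_total for w in weights]
    direction[i] = 1.0  # overwrite the entry for point i
    return direction


def calibrated_scores(raw_sensitivity, weights):
    """First-order change in utility along each mass-preserving direction.

    raw_sensitivity[j] stands in for the sensitivity of the utility to the
    weight of point j (how it is obtained is outside this sketch).
    """
    scores = []
    for i in range(len(weights)):
        d = redistribution_direction(weights, i)
        scores.append(sum(g * dj for g, dj in zip(raw_sensitivity, d)))
    return scores


# Toy inputs, chosen only for illustration.
weights = [0.25, 0.25, 0.25, 0.25]
raw = [1.0, 2.0, 3.0, 6.0]
scores = calibrated_scores(raw, weights)
```

With uniform weights, each calibrated score under this assumed rule reduces to the point's raw sensitivity minus the average raw sensitivity of the other points.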
Neural Dynamic Data Valuation via Stochastic State-Adjoint Trajectories
- Year
- 2024
- Abstract & full text
- arxiv.org/abs/2404.19557