Fairness audits are a key component of responsible machine-learning deployment. Yet, the reliability of audit recommendations under incomplete protected-label access is still poorly understood. In this work, we focused on protected-label missingness in fairness mitigation audits. We introduced a seed-calibrated stress test to separate missingness effects from seed-to-seed movement that is already present under complete labels. Across ACS/Folktables tasks, we found that positive-availability missingness usually does not move selected mitigation methods beyond the complete-label seed floor. The no-label endpoint behaves differently, exposing ERM-equivalent candidates and deterministic tie-breaking rather than a broad missingness effect. We also found that threshold optimization can turn single-axis fairness gains into above-null intersectional harm, a sharper failure pattern that appears to remain visible under random-forest validation. Overall, our results highlight that protected-label missingness should be reported with seed-null calibration, candidate-set context, and intersectional consequences before it is treated as evidence of audit fragility.
How Reliable are Fairness Audits with Unreliable Data?
- Year
- 2025
- Hosting
- Full text hosted (CC-BY-4.0)
Attribution
- Abstract & full text
- arxiv.org/abs/2506.23033 (CC-BY-4.0)
- TL;DR
- Semantic Scholar