How Reliable Are Fairness Audits with Unreliable Data?

Fairness audits are a key component of responsible machine-learning deployment, yet the reliability of audit recommendations under incomplete protected-label access is still poorly understood. In this work, we study protected-label missingness in fairness mitigation audits. We introduce a seed-calibrated stress test that separates missingness effects from the seed-to-seed movement already present under complete labels. Across ACS/Folktables tasks, we find that positive-availability missingness usually does not move selected mitigation methods beyond the complete-label seed floor. The no-label endpoint behaves differently: it exposes ERM-equivalent candidates and deterministic tie-breaking rather than a broad missingness effect. We also find that threshold optimization can turn single-axis fairness gains into above-null intersectional harm, a sharper failure pattern that appears to remain visible under random-forest validation. Overall, our results indicate that protected-label missingness should be reported with seed-null calibration, candidate-set context, and intersectional consequences before it is treated as evidence of audit fragility.
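
To make the idea of seed-null calibration concrete, the following is a minimal sketch, not the procedure used in this work. It assumes one scalar audit metric per seed (for example, a fairness gap), takes the seed floor to be an empirical quantile of pairwise absolute differences between complete-label seeds, and flags a missingness effect only when the mean paired shift exceeds that floor. The function names `seed_floor` and `exceeds_seed_floor`, the 0.95 quantile, and the example numbers are all hypothetical.

```python
from itertools import combinations


def seed_floor(complete_scores, quantile=0.95):
    """Empirical quantile of absolute pairwise differences between seeds.

    Assumption: `complete_scores` holds one metric value per seed, all
    computed with complete protected labels.
    """
    diffs = sorted(abs(a - b) for a, b in combinations(complete_scores, 2))
    if not diffs:
        raise ValueError("need at least two seeds to estimate a seed floor")
    # Index of the requested quantile, clamped to the last element.
    idx = min(len(diffs) - 1, int(quantile * len(diffs)))
    return diffs[idx]


def exceeds_seed_floor(complete_scores, missing_scores, quantile=0.95):
    """Compare the mean paired shift under missingness with the seed floor.

    Assumption: both lists are aligned by seed, so position i in each list
    comes from the same seed with and without label missingness.
    """
    if len(complete_scores) != len(missing_scores):
        raise ValueError("score lists must be aligned by seed")
    shifts = [abs(m - c) for c, m in zip(complete_scores, missing_scores)]
    mean_shift = sum(shifts) / len(shifts)
    floor = seed_floor(complete_scores, quantile)
    return mean_shift > floor, mean_shift, floor


# Hypothetical metric values for four seeds (illustrative only).
complete = [0.10, 0.12, 0.08, 0.11]
small_shift = [0.11, 0.12, 0.09, 0.10]
large_shift = [0.20, 0.22, 0.18, 0.21]

flag_small, _, _ = exceeds_seed_floor(complete, small_shift)
flag_large, _, _ = exceeds_seed_floor(complete, large_shift)
```

In this toy setup, a shift smaller than ordinary seed-to-seed variation is not flagged, while a shift several times larger is. A real audit would likely need more seeds and a more careful null, such as a permutation or bootstrap test.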