0

Trustworthy AI Suffers from Invariance Conflicts and Causality is The Solution

As artificial intelligence (AI), including machine learning (ML) models and foundation models (FMs), are increasingly deployed in high-stakes domains, ensuring their trustworthiness has become a central challenge.

Preview
Year
2026
Hosting
Full text hostedCC-BY-4.0

Cite

Notes

Only stored in your browser.

Attribution

Abstract & full text
arxiv.org/abs/2605.02640CC-BY-4.0
TL;DR
Semantic Scholar
Attribution policy →

Abstract

As artificial intelligence (AI), including machine learning (ML) models and foundation models (FMs), are increasingly deployed in high-stakes domains, ensuring their trustworthiness has become a central challenge. However, the core trustworthy AI objectives, such as fairness, robustness, privacy, and explainability, are hard to achieve simultaneously, especially while preserving utility. This position paper argues that causality is necessary to understand and balance trade-offs in performance and multiple objectives of trustworthy AI. We ground our arguments in re-interpreting trustworthy AI trade-offs as incompatible invariance requirements under different changes to the data-generating process. We then illustrate this argument through case-study analyses from the literature and a stylized synthetic-data simulation, showing that causality provides a unifying framework for understanding how trade-offs in trustworthy AI arise and how they can be softened or resolved through selective invariance. This perspective applies to both classical ML models and large-scale FMs. Finally, we outline open challenges and opportunities for using causality to build both trustworthy and high-performing AI.