High-dimensional compositional data pose unique statistical challenges due to the simplex constraint and excess zeros. While dimension reduction is indispensable for analyzing such data, conventional approaches often rely on log-ratio transformations that compromise interpretability and distort the data through ad hoc zero replacements. To address these issues, we introduce a geometry-preserving framework for dimension reduction of compositional data, mapping high-dimensional compositions directly to a lower-dimensional simplex. This framework is interpretable as a softened amalgamation of compositions and enables dual visualization -- showing both projected data and how variables contribute to reduced components -- for at-a-glance interpretation. Within this geometry, we define a new sufficient dimension reduction (SDR) approach for compositional predictors, whose identifiable object, termed the central compositional subspace, differs from the classical central subspace in Euclidean SDR. For estimation, we propose a kernel-based method that yields sparse solutions and comes with an intrinsic predictive model for direct downstream analyses. We prove consistency through a new subspace-comparison argument that allows the estimated and target subspaces to have different dimensions. Applications to real microbiome datasets demonstrate that our approach provides a powerful graphical exploration tool for uncovering meaningful biological patterns in high-dimensional compositional data.
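The "softened amalgamation" reading of a simplex-to-simplex map can be illustrated with a small sketch. This is one plausible interpretation, not necessarily the paper's construction: it assumes the reduction is a linear map `Z = X V`, where each row of the hypothetical matrix `V` is itself a composition describing how one original part distributes its mass across the reduced components. The function name `soft_amalgamate` and the toy data are made up for illustration. Under that assumption the output stays on the lower-dimensional simplex, and exact zeros need no replacement.

```python
import numpy as np

def soft_amalgamate(X, V):
    """Map compositions on the (p-1)-simplex to the (d-1)-simplex.

    X: (n, p) array, rows nonnegative and summing to 1 (zeros allowed).
    V: (p, d) array, rows nonnegative and summing to 1; V[j, k] is the
       share of original part j assigned to reduced component k.
       (Assumed parameterization, for illustration only.)
    """
    X = np.asarray(X, dtype=float)
    V = np.asarray(V, dtype=float)
    if np.any(V < 0) or not np.allclose(V.sum(axis=1), 1.0):
        raise ValueError("each row of V must lie on the simplex")
    # Row sums are preserved: (X V) 1 = X (V 1) = X 1 = 1.
    return X @ V

# Hypothetical toy data: 2 samples, 4 parts, including exact zeros.
X = np.array([[0.5, 0.0, 0.3, 0.2],
              [0.0, 0.0, 0.6, 0.4]])

# Hard amalgamation: parts {0, 1} -> component 0, parts {2, 3} -> component 1.
V_hard = np.array([[1.0, 0.0],
                   [1.0, 0.0],
                   [0.0, 1.0],
                   [0.0, 1.0]])

# Softened version: part 2 splits its mass between both components.
V_soft = np.array([[1.0, 0.0],
                   [1.0, 0.0],
                   [0.3, 0.7],
                   [0.0, 1.0]])

Z_hard = soft_amalgamate(X, V_hard)
Z_soft = soft_amalgamate(X, V_soft)
```

With 0/1 entries, `V_hard` reproduces ordinary amalgamation (summing groups of parts); fractional rows such as the third row of `V_soft` soften it. The rows of `V` also indicate how each variable contributes to each reduced component, which is the kind of information a dual visualization would display.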
Geometry-preserving and interpretable dimension reduction for compositional data
- Year
- 2025
Attribution
- Abstract & full text
- arxiv.org/abs/2509.05563
- TL;DR
- Semantic Scholar