0

Exact and Approximate Range Queries for Efficient Ball Mapper Construction

Ball Mapper is a tool in topological data analysis that summarizes a finite metric dataset by covering it with metric balls and encoding their overlaps as a graph. Its construction requires repeated fixed-radius range queries, which can become computationally expensive for large…

Preview
Year
2026
Hosting
Full text hostedCC-BY-4.0

Cite

Notes

Only stored in your browser.

Attribution

Abstract & full text
arxiv.org/abs/2601.01405CC-BY-4.0
TL;DR
Semantic Scholar
Attribution policy →

Abstract

Ball Mapper is a tool in topological data analysis that summarizes a finite metric dataset by covering it with metric balls and encoding their overlaps as a graph. Its construction requires repeated fixed-radius range queries, which can become computationally expensive for large or high-dimensional datasets. This work studies two approaches to accelerating this step: ball tree data structures, which use metric-space pruning, and the FAISS library, which uses optimized similarity-search routines for dense vectors. We distinguish between exact acceleration, where the range sets are preserved, and approximate search, where ball memberships may change. For approximate range queries, we formulate deterministic additive and multiplicative error models and show how these errors affect the covering radius, landmark separation, and graph structure of Ball Mapper. We then evaluate several FAISS index configurations on synthetic datasets with different geometries. The experiments show that the tested approximate indexes behave conservatively. They remove ball memberships and graph edges but do not introduce false-positive memberships or spurious edges. The severity of these effects depends strongly on dataset geometry, with the isotropic Gaussian dataset being more sensitive than clustered or low-dimensional structured data.