0

On Rate-Optimal Partitioning Classification from Observable and from Privatised Data

In this paper we revisit the classical method of partitioning classification and prove novel convergence rates under relaxed conditions, both for observable (non-privatised) and for privatised data. We consider the problem of classification in a $d$ dimensional Euclidean space.

Year
2023
Hosting
Abstract onlyARXIV-DEFAULT

Cite

Notes

Only stored in your browser.

Attribution

Abstract & full text
arxiv.org/abs/2312.14889ARXIV-DEFAULT
TL;DR
Semantic Scholar
Attribution policy →

Abstract

In this paper we revisit the classical method of partitioning classification and prove novel convergence rates under relaxed conditions, both for observable (non-privatised) and for privatised data. We consider the problem of classification in a d dimensional Euclidean space. Previous results on the partitioning classifier worked with the strong density assumption (SDA), which is restrictive, as we demonstrate through simple examples. Here, we study the problem under much milder assumptions. We presuppose that the distribution of the inputs is a mixture of an absolutely continuous and a discrete distribution, such that the absolutely continuous component is concentrated on a d_a dimensional subspace. In addition to the standard Lipschitz and margin conditions, a novel characteristic of the absolutely continuous component is introduced, by which the convergence rate of the classification error probability is computed, both for the binary and for the multi-class cases. This bound can reach the minimax optimal convergence rate achievable using SDA, but under much milder distributional assumptions. Interestingly, this convergence rate depends only on the intrinsic dimension of the continuous inputs, d_a, and not on d. Under privacy constraints, the data cannot be directly observed, and the constructed classifiers are functions of the randomised outcome of a suitable local differential privacy mechanism. In this paper we add Laplace distributed noises to the discretisations of all possible locations of the feature vector and to its label. Again, tight upper bounds on the convergence rate of the classification error probability can be derived, without using SDA, such that this rate depends on 2d_a.