Minimax Optimal Algorithms with Fixed-$k$-Nearest Neighbors

This paper presents how to perform minimax optimal classification, regression, and density estimation based on fixed-k nearest neighbor (NN) searches. We consider a distributed learning scenario, in which a massive dataset is split into smaller groups, where the k-NNs are found for a query point with respect to each subset of data. We propose optimal rules to aggregate the fixed-k-NN information for classification, regression, and density estimation that achieve minimax optimal rates for the respective problems. We show that the distributed algorithm with a fixed k over a sufficiently large number of groups attains a minimax optimal error rate up to a multiplicative logarithmic factor under some regularity conditions. Roughly speaking, distributed k-NN rules with M groups has a performance comparable to the standard \Theta(kM)-NN rules even for fixed k.

Minimax Optimal Algorithms with Fixed-$k$-Nearest Neighbors

Abstract

Authors