Massively parallel and streaming algorithms for balanced clustering

Kian Mirjalali,H. Zarrabi-Zadeh

Published 2023 in Theoretical Computer Science

ABSTRACT

Clustering is a fundamental problem with a wide range of applications. In this paper, we study a balanced version of the well-known k -center clustering problem, where the objective is to select k centers from a given set of points, and assign to each center at most L of the input points, so as to minimize the maximum distance from a point to the center to which it is assigned. We assume soft capacities, where points are allowed to be selected as center more than once. Motivated by applications involving massive datasets, we study the problem in the massively parallel computation (MPC) and streaming models. In particular, we present a two-round MPC algorithm for the balanced k -center problem, achieving an approximation factor of 5 (cid:11) + 2, where (cid:11) is the approximation factor of the corresponding sequential algorithm. This substantially improves the currently best approximation factor of 32 (cid:11) , available for the problem. We show that the approximation factor of our algorithm can be further improved to 5 + " in all spaces with bounded doubling dimension, including the Euclidean space. We also consider the balanced k -center problem in the streaming model, and present a constant-factor streaming algorithm in any metric space using O ( kn " ) memory, and a factor 5 + " streaming algorithm using O ( k polylog n ) memory in doubling metrics.

PUBLICATION RECORD

CITATION MAP

EXTRACTION MAP

CLAIMS

  • No claims are published for this paper.

CONCEPTS

  • No concepts are published for this paper.

REFERENCES

Showing 1-26 of 26 references · Page 1 of 1

CITED BY

  • No citing papers are available for this paper.

Showing 0-0 of 0 citing papers · Page 1 of 1