Let P be a set of n points in R d . In the projective clustering problem, given k;q and norm 2 [1;1], we have to compute a setF ofk q-dimensional ats such that ( P p2P d(p;F) ) 1= is minimized; here d(p;F) represents the (Euclidean) distance of p to the closest at in F. We let f q k (P; ) denote the minimal value and interpret f q k (P;1) to be maxr2P d(r;F). When = 1; 2 and1 and q = 0, the problem corresponds to the k-median, kmean and the k-center clustering problems respectively. For every 0 < " < 1, S P and 1, we show that the orthogonal projection of P onto a randomly chosen at of dimension O(((q + 1) 2 log(1=")=" 3 ) logn) will "approximate f q 1 (S; ). This result combines the concepts of geometric coresets and subspace embeddings based on the Johnson-Lindenstrauss Lemma. As a consequence, an orthogonal projection of P to an O(((q + 1) 2 log((q + 1)=")=" 3 ) logn) dimensional randomly chosen subspace "-approximates projective clusterings for every k and simultaneously. Note that the dimension of this subspace is independent of the number of clusters k. Using this dimension reduction result, we obtain new approximation and streaming algorithms for projective clustering problems. For example, given a stream of n points, we show how to compute an "-approximate projective clustering for every k and simultaneously using only O((n + d)((q + 1) 2 log((q + 1)="))=" 3 logn) space. Compared to standard streaming algorithms with ( kd) space requirement, our approach is a signicant improvement when the number of input points and their dimensions are of the same order of magnitude.
Approximation and Streaming Algorithms for Projective Clustering via Random Projections
Published 2014 in Canadian Conference on Computational Geometry
ABSTRACT
PUBLICATION RECORD
- Publication year
2014
- Venue
Canadian Conference on Computational Geometry
- Publication date
2014-07-08
- Fields of study
Mathematics, Computer Science
- Identifiers
- External record
- Source metadata
Semantic Scholar
CITATION MAP
EXTRACTION MAP
CLAIMS
- No claims are published for this paper.
CONCEPTS
- No concepts are published for this paper.
REFERENCES
Showing 1-43 of 43 references · Page 1 of 1
CITED BY
Showing 1-28 of 28 citing papers · Page 1 of 1