We investigate the computational complexity of the task of detecting dense regions of an unknown distribution from unlabeled samples of this distribution. We introduce a formal learning model for this task that uses a hypothesis class as it “anti-overfitting” mechanism. The learning task in our model can be reduced to a combinatorial optimization problem. We can show that for some constants, depending on the hypothesis class, these problems are NP-hard to approximate to within these constant factors. We go on and introduce a new criterion for the success of approximate optimization geometric problems. The new criterion requires that the algorithm competes with hypotheses only on the points that are separated by some margin ? from their boundaries. Quite surprisingly, we discover that for each of the two hypothesis classes that we investigate, there is a “critical value” of the margin parameter ?. For any value below the critical value the problems are NP-hard to approximate, while, once this value is exceeded, the problems become poly-time solvable.
The Computational Complexity of Densest Region Detection
Shai Ben-David,Nadav Eiron,H. Simon
Published 2000 in Journal of computer and system sciences (Print)
ABSTRACT
PUBLICATION RECORD
- Publication year
2000
- Venue
Journal of computer and system sciences (Print)
- Publication date
2000-06-28
- Fields of study
Mathematics, Computer Science
- Identifiers
- External record
- Source metadata
Semantic Scholar
CITATION MAP
EXTRACTION MAP
CLAIMS
- No claims are published for this paper.
CONCEPTS
- No concepts are published for this paper.
REFERENCES
Showing 1-17 of 17 references · Page 1 of 1
CITED BY
Showing 1-24 of 24 citing papers · Page 1 of 1