We give two provably accurate feature-selection techniques for the linear SVM. The two algorithms are deterministic and randomized, respectively, and can be used in an unsupervised or supervised setting. The supervised approach is based on sampling features from support vectors. We prove that the margin in the sampled feature space is preserved to within ε-relative error of the margin in the full feature space in the worst case. In the unsupervised setting, we also provide worst-case guarantees on the radius of the minimum enclosing ball, thereby ensuring generalization comparable to that in the full feature space and resolving an open problem posed in Dasgupta et al. (2007). We present extensive experiments on real-world datasets to support our theory and to demonstrate that our method is competitive with, and often better than, prior state-of-the-art methods, for which there are no known provable guarantees.
HIGHLIGHTS
- We give two provably accurate feature-selection techniques for the linear SVM.
- The algorithms can be used in a supervised or unsupervised setting.
- We prove the margin is preserved to within ε-relative error of the margin in the full feature space.
- In the unsupervised case, we provide worst-case guarantees on the margin and the radius of the minimum enclosing ball.
- Extensive experiments demonstrate that our method is competitive with and often better than prior art.
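The supervised approach described above can be sketched roughly as follows. This is a minimal illustration, not the paper's algorithm: it assumes per-feature importance scores come from the squared entries of a linear SVM weight vector `w` (the paper derives its sampling probabilities from the support vectors), then samples and rescales columns so that inner products are preserved in expectation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical weight vector of an already-trained linear SVM.
# Treating its squared entries as feature importances is an
# illustrative simplification of the paper's support-vector-based scores.
w = np.array([3.0, 0.1, 2.0, 0.05, 1.0])
p = w ** 2 / np.sum(w ** 2)        # one sampling probability per feature

r = 3                               # number of features (columns) to keep
idx = rng.choice(len(w), size=r, replace=True, p=p)
scale = 1.0 / np.sqrt(r * p[idx])   # standard rescaling so that inner
                                    # products are preserved in expectation

def select_features(X):
    """Project data rows onto the sampled, rescaled coordinates."""
    return X[:, idx] * scale

X = rng.standard_normal((4, 5))     # 4 points, 5 original features
X_small = select_features(X)        # 4 points, 3 sampled features
```

High-importance features (large entries of `p`) are sampled more often, while the `1/sqrt(r * p)` rescaling keeps the sketch unbiased, which is the standard mechanism behind relative-error guarantees of this kind.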
Feature selection for linear SVM with provable guarantees
Saurabh Paul, M. Magdon-Ismail, P. Drineas
Published 2014 in Pattern Recognition
PUBLICATION RECORD
- Publication year: 2014
- Venue: Pattern Recognition
- Publication date: 2014-06-01
- Fields of study: Mathematics, Computer Science
- Source metadata: Semantic Scholar
REFERENCES
37 references
CITED BY
61 citing papers