Optimally Pruning Decision Tree Ensembles With Feature Cost

Feng Nan, Joseph Wang, Venkatesh Saligrama

Published 2016 in arXiv.org

ABSTRACT

We consider the problem of learning decision rules for prediction under a feature budget constraint. In particular, we are interested in pruning an ensemble of decision trees to reduce expected feature cost while maintaining high prediction accuracy for any test example. We propose a novel 0-1 integer program formulation for ensemble pruning. Our pruning formulation is general: it takes any ensemble of decision trees as input. By explicitly accounting for feature sharing across trees together with the accuracy/cost trade-off, our method significantly reduces feature cost by pruning subtrees whose feature cost outweighs their gain in prediction accuracy. Theoretically, we prove that a linear programming relaxation produces the exact solution of the original integer program. This allows us to use efficient convex optimization tools to obtain an optimally pruned ensemble for any given budget. Empirically, we see that our pruning algorithm significantly improves the performance of the state-of-the-art ensemble method BudgetRF.
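
The trade-off described in the abstract can be illustrated on a toy scale. The sketch below is not the paper's integer program or its LP relaxation: it swaps in brute-force enumeration over every pruning of a two-tree ensemble and picks the one minimizing training error plus a weighted feature cost, where a feature shared by several trees is paid for once per example. The data, the feature costs, the tree encoding, the weight `lam`, and all function names are hypothetical choices made for illustration only.

```python
from itertools import product

# Hypothetical toy data: each example is (binary feature vector, label).
DATA = [((0, 0), 0), ((0, 1), 0), ((1, 0), 1), ((1, 1), 1)]
COST = {0: 1.0, 1: 5.0}  # assumed acquisition cost per feature index

# A tree is a nested tuple: ("leaf",) or ("split", feature, left, right).
# Go left when x[feature] == 0, otherwise go right.
TREE_A = ("split", 0, ("split", 1, ("leaf",), ("leaf",)), ("leaf",))
TREE_B = ("split", 1, ("leaf",), ("split", 0, ("leaf",), ("leaf",)))


def prunings(tree):
    """Yield every tree obtainable by collapsing subtrees into leaves."""
    yield ("leaf",)
    if tree[0] == "split":
        _, f, left, right = tree
        for l, r in product(list(prunings(left)), list(prunings(right))):
            yield ("split", f, l, r)


def route(tree, x):
    """Return (leaf identifier, set of features queried) for input x."""
    used, path = set(), []
    while tree[0] == "split":
        _, f, left, right = tree
        used.add(f)
        go_right = x[f] == 1
        path.append(go_right)
        tree = right if go_right else left
    return tuple(path), used


def tree_errors(tree, data):
    """Training errors when each leaf predicts its majority label."""
    buckets = {}
    for x, y in data:
        leaf, _ = route(tree, x)
        buckets.setdefault(leaf, []).append(y)
    return sum(min(ys.count(0), ys.count(1)) for ys in buckets.values())


def objective(trees, data, lam):
    """Summed per-tree error plus lam times the total feature cost.

    A feature used by several trees on the same example is charged once.
    """
    err = sum(tree_errors(t, data) for t in trees)
    cost = 0.0
    for x, _ in data:
        used = set().union(*(route(t, x)[1] for t in trees))
        cost += sum(COST[f] for f in used)
    return err + lam * cost


def best_pruning(ensemble, data, lam):
    """Exhaustively search all joint prunings (feasible only for toy sizes)."""
    candidates = product(*(list(prunings(t)) for t in ensemble))
    return min(candidates, key=lambda trees: objective(trees, data, lam))


if __name__ == "__main__":
    for lam in (0.0, 0.1, 10.0):
        best = best_pruning([TREE_A, TREE_B], DATA, lam)
        print(lam, objective(best, DATA, lam), best)
```

Raising `lam` makes the search discard splits on the expensive feature first and eventually collapse both trees to single leaves. Exhaustive enumeration grows exponentially with tree size, which is the motivation for an exact, efficiently solvable formulation such as the one the abstract describes.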

PUBLICATION RECORD

Publication year: 2016
Venue: arXiv.org
Publication date: 2016-01-05
Fields of study: Mathematics, Computer Science
Identifiers: arXiv 1601.00955
Source metadata: Semantic Scholar

REFERENCES

Points of Significance: Classification and regression trees (2017, cited by this paper)
Feature-Budgeted Random Forest (2015, cited by this paper)
Efficient Learning by Directed Acyclic Graph For Resource Constrained Prediction (2015, cited by this paper)
An LP for Sequential Learning Under Budgets (2014, cited by this paper)
Fast margin-based cost-sensitive classification (2014, cited by this paper)
Model Selection by Linear Programming (2014, cited by this paper)
Integer and Combinatorial Optimization (2013, cited by this paper)
Supervised Sequential Classification Under Budget Constraints (2013, cited by this paper)
The Greedy Miser: Learning under Test-time Budgets (2012, cited by this paper)
Pruning of Random Forest classifiers: A survey and future directions (2012, cited by this paper)
Proceedings of the Yahoo! Learning to Rank Challenge, held at ICML 2010, Haifa, Israel, June 25, 2010 (2011, cited by this paper)
Active Classification based on Value of Classifier (2011, cited by this paper)
Learning Multiple Layers of Features from Tiny Images (2009, cited by this paper)
An Optimal Constrained Pruning Strategy for Decision Trees (2009, influential reference)
A Dynamic Programming Based Pruning Method for Decision Trees (2001, cited by this paper)

CITED BY

3FS-CBR-IRF: improving case retrieval for case-based reasoning with three feature selection and improved random forest (2024, cites this paper)