A random forest guided tour

Published 2015 in Test (Madrid)

ABSTRACT

The random forest algorithm, proposed by L. Breiman in 2001, has been extremely successful as a general-purpose classification and regression method. The approach, which combines several randomized decision trees and aggregates their predictions by averaging, has shown excellent performance in settings where the number of variables is much larger than the number of observations. Moreover, it is versatile enough to be applied to large-scale problems, is easily adapted to various ad hoc learning tasks, and returns measures of variable importance. The present article reviews the most recent theoretical and methodological developments for random forests. Emphasis is placed on the mathematical forces driving the algorithm, with special attention given to the selection of parameters, the resampling mechanism, and variable importance measures. This review is intended to provide non-experts easy access to the main ideas.
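The mechanism the abstract describes (several randomized trees whose predictions are averaged) can be illustrated with a minimal sketch for regression. Everything here is an assumption made for illustration rather than the paper's own code: the function names (`build_tree`, `predict_tree`, `fit_forest`, `predict_forest`), the parameter names (`mtry`, `min_leaf`), the default values, and the toy data are all hypothetical. The sketch uses two common sources of randomness, a bootstrap resample per tree and a random feature subset per split, and it omits variable importance measures.

```python
import random
from statistics import mean


def build_tree(rows, targets, mtry, min_leaf, rng):
    """Grow one randomized regression tree (a float leaf or a split tuple)."""
    if len(rows) < 2 * min_leaf or len(set(targets)) == 1:
        return mean(targets)
    best = None  # (squared error, feature index, threshold)
    # Randomization: only a random subset of mtry features is tried per node.
    for j in rng.sample(range(len(rows[0])), mtry):
        values = sorted(set(row[j] for row in rows))
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2
            left = [y for row, y in zip(rows, targets) if row[j] <= t]
            right = [y for row, y in zip(rows, targets) if row[j] > t]
            if len(left) < min_leaf or len(right) < min_leaf:
                continue
            ml, mr = mean(left), mean(right)
            sse = sum((y - ml) ** 2 for y in left) + sum((y - mr) ** 2 for y in right)
            if best is None or sse < best[0]:
                best = (sse, j, t)
    if best is None:  # no admissible split: fall back to a leaf
        return mean(targets)
    _, j, t = best
    li = [i for i, row in enumerate(rows) if row[j] <= t]
    ri = [i for i, row in enumerate(rows) if row[j] > t]
    return (
        j,
        t,
        build_tree([rows[i] for i in li], [targets[i] for i in li], mtry, min_leaf, rng),
        build_tree([rows[i] for i in ri], [targets[i] for i in ri], mtry, min_leaf, rng),
    )


def predict_tree(tree, x):
    """Follow the splits down to a leaf and return its stored mean."""
    while isinstance(tree, tuple):
        j, t, left, right = tree
        tree = left if x[j] <= t else right
    return tree


def fit_forest(rows, targets, n_trees=30, mtry=1, min_leaf=3, seed=0):
    """Fit n_trees trees, each on its own bootstrap resample of the data."""
    rng = random.Random(seed)
    n = len(rows)
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]  # sample n rows with replacement
        forest.append(
            build_tree([rows[i] for i in idx], [targets[i] for i in idx], mtry, min_leaf, rng)
        )
    return forest


def predict_forest(forest, x):
    """Aggregate the individual tree predictions by averaging."""
    return mean(predict_tree(tree, x) for tree in forest)


# Toy data (made up): the response depends only on the first of two features.
data_rng = random.Random(1)
X = [[data_rng.random(), data_rng.random()] for _ in range(80)]
y = [1.0 if row[0] > 0.5 else 0.0 for row in X]

forest = fit_forest(X, y)
high = predict_forest(forest, [0.9, 0.5])
low = predict_forest(forest, [0.1, 0.5])
print(high, low)
```

Averaging is what distinguishes the forest from a single tree: each tree may split on the uninformative feature at some nodes, but the mean over many randomized trees smooths out those individual choices.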

PUBLICATION RECORD

Publication year: 2015
Venue: Test (Madrid)
Publication date: 2015-11-18
Fields of study: Mathematics, Computer Science
Identifiers: DOI 10.1007/s11749-016-0481-7; arXiv 1511.05741
Source metadata: Semantic Scholar
