Random forest for gene selection and microarray data classification

Published 2011 in Knowledge Technology Week

ABSTRACT

A random forest method is used to perform both gene selection and classification of microarray data. In this embedded method, selecting the smallest possible set of genes with the lowest error rate is the key factor in achieving the highest classification accuracy. Hence, an improved gene selection method using random forest is proposed to obtain the smallest subset of genes, as well as the biggest subset of genes, prior to classification. The option of selecting the biggest subset is provided to assist researchers who intend to use the informative genes for further research. The enhanced random forest gene selection performed better in selecting both the smallest and the biggest subsets of informative genes with the lowest out-of-bag (OOB) error rates. Furthermore, classification performed with random forest on the selected subset of genes led to lower prediction error rates than the existing method and other similar available methods.
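
The abstract does not spell out the selection procedure, so the following is only a minimal sketch of one common way to do OOB-driven gene selection with a random forest: backward elimination, where a forest is refit on progressively smaller gene sets and the OOB error of each set is recorded. It is not the authors' implementation. The function names (`backward_eliminate`, `pick_subsets`), the parameters (`drop_fraction`, `n_trees`, `tolerance`), the use of scikit-learn, and the synthetic data are all assumptions made for illustration.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier


def backward_eliminate(X, y, drop_fraction=0.2, n_trees=100, seed=0):
    """Refit a forest on shrinking gene sets, recording the OOB error of each set.

    Hypothetical helper: the drop schedule and stopping size are assumptions.
    """
    genes = np.arange(X.shape[1])
    history = []  # one (gene indices, OOB error) pair per subset tried
    while True:
        forest = RandomForestClassifier(
            n_estimators=n_trees, oob_score=True, random_state=seed
        )
        forest.fit(X[:, genes], y)
        # oob_score_ is OOB accuracy, so the OOB error is its complement.
        history.append((genes.copy(), 1.0 - forest.oob_score_))
        if len(genes) <= 2:
            break
        n_drop = max(1, int(len(genes) * drop_fraction))
        # Keep everything except the n_drop genes with the lowest importance.
        keep = np.argsort(forest.feature_importances_)[n_drop:]
        genes = np.sort(genes[keep])
    return history


def pick_subsets(history, tolerance=0.0):
    """Return the smallest and the biggest subset whose OOB error is within
    `tolerance` of the lowest error seen (hypothetical selection rule)."""
    best = min(err for _, err in history)
    good = [(g, err) for g, err in history if err <= best + tolerance]
    smallest = min(good, key=lambda item: len(item[0]))
    largest = max(good, key=lambda item: len(item[0]))
    return smallest, largest


# Synthetic stand-in for a microarray matrix (not real data):
# 60 samples, 200 genes, of which the first 5 are shifted in class 1.
rng = np.random.default_rng(0)
X = rng.normal(size=(60, 200))
y = np.repeat([0, 1], 30)
X[y == 1, :5] += 2.0

history = backward_eliminate(X, y)
smallest, largest = pick_subsets(history)
print("smallest subset:", len(smallest[0]), "genes, OOB error", smallest[1])
print("biggest subset: ", len(largest[0]), "genes, OOB error", largest[1])
```

Returning both ends of the range of equally good subsets mirrors the two options described in the abstract: the smallest subset for a compact classifier, and the biggest subset for researchers who want a wider pool of informative genes. Because the OOB error here is also used to choose the subset, it is optimistic as an estimate of prediction error; a held-out or resampled estimate would be needed for that.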

PUBLICATION RECORD

Publication year
2011
Venue
Knowledge Technology Week
Publication date
2011-07-18
Fields of study
Biology, Medicine, Computer Science
Identifiers
DOI 10.6026/97320630007142 PMID 22125385 PMCID 3218317
Source metadata
Semantic Scholar, PubMed
