Challenges of Feature Selection for Big Data Analytics

Published 2016 in IEEE Intelligent Systems

ABSTRACT

We're surrounded by huge amounts of large-scale high-dimensional data, but learning tasks require reduced data dimensionality. Feature selection has shown its effectiveness in many applications by building simpler and more comprehensive models, improving learning performance, and preparing clean, understandable data. Some unique characteristics of big data such as data velocity and data variety have presented challenges to the feature selection problem. In this article, the authors envision these challenges for big data analytics. To facilitate and promote feature selection research, they present an open source feature selection repository (scikit-feature) of popular algorithms.

PUBLICATION RECORD

Publication year
2016
Venue
IEEE Intelligent Systems
Publication date
2016-11-07
Fields of study
Computer Science, Engineering
Identifiers
DOI: 10.1109/MIS.2017.38
arXiv: 1611.01875
Source metadata
Semantic Scholar

REFERENCES

Robust Unsupervised Feature Selection on Networked Data (2016)
FeatureMiner: A Tool for Interactive Feature Selection (2016)
Toward Time-Evolving Feature Selection on Dynamic Networks (2016)
Unsupervised Feature Selection on Data Streams (2015)
Unsupervised Streaming Feature Selection in Social Media (2015)
Multi-View Clustering and Feature Learning via Structured Sparsity (2013)
Unsupervised Feature Selection for Multi-View Data in Social Media (2013)
Sparse methods for biomedical data (2012)
Unsupervised feature selection for linked social media data (2012)
Spectral Feature Selection for Data Mining (2011)
Stable Feature Selection for Biomarker Discovery (2010)
Parallel Large Scale Feature Selection for Logistic Regression (2009)
Structured Variable Selection with Sparsity-Inducing Norms (2009)
Book Review: Computational Methods of Feature Selection (2007)
Subband correlation and robust speech recognition (2005)
Computational Methods of Feature Selection (year unknown)

CITED BY

An Intelligent Heart Disease Identification Framework Using Supervised Machine Learning for Clinical Decision Support (2026)
Feature selection and information fusion based on preference ranking organization method in interval-valued multi-source decision-making information systems (2025)
Performance Analysis of Proposed Scalable Reversible Randomization Algorithm (SRRA) in Privacy Preserving Big Data Analytics (2025)
Exposing Optimal Feature Sets for Enhancing Machine Learning Performance (2025)
WHHO: enhanced Harris hawks optimizer for feature selection in high-dimensional data (2025)
PCMINN: A GPU-Accelerated Conditional Mutual Information-Based Feature Selection Method (2025)
A NOVEL HYBRID FEATURE SELECTION FRAMEWORK FOR ENHANCING ACCURACY AND INTERPRETABILITY IN MACHINE LEARNING MODEL FOR STUDENT PERFORMANCE PREDICTION (2025)
Advancements in Hybrid Machine Learning Models for Biomedical Disease Classification Using Integration of Hyperparameter-Tuning and Feature Selection Methodologies: A Comprehensive Review (2025)
Modeling internal charge effects on capacitor dynamics for non-invasive estimation of membrane potential (2025)
A review of feature selection methods for actual evapotranspiration prediction (2025)
Feature Extraction Using Sparse Autoencoder in Automatic Image Tagging (2025)
Dynamic Multi-Level Competition Learning-Based Dual-Task Optimization for High-Dimensional Feature Selection (2024)
An evolutionary multiobjective method based on dominance and decomposition for feature selection in classification (2024)
Self-paced regularized adaptive multi-view unsupervised feature selection (2024)
Early Prediction of Stroke using XGBoost classification (2024)
Collaboration failure analysis in cyber-physical system-of-systems using context fuzzy clustering (2024)
Optimizing Feature Selection in Big Data: A Hybrid Spark and Fuzzy Approach (2024)
Quantum computing and quantum-inspired techniques for feature subset selection: a review (2024)
Improved fetal heartbeat detection using pitch shifting and psychoacoustics (2024)
A Comparative Analysis of Decision Tree Classifier Performance in the Medical Data Analysis (2024)
Portable automatic nutrient mixing based on microcontroller for hydroponic vegetable cultivation (2024)
Fusion of Earth Observation Data and Sociodemographic census for prediction of Anxiety using Machine Learning Algorithms – A Case Study (2024)
Ensemble Feature Selection based on Multiple Metrics and Improved Aggregation Strategies (2024)
Information gain ratio-based subfeature grouping empowers particle swarm optimization for feature selection (2024)
Hierarchical learning multi-objective firefly algorithm for high-dimensional feature selection (2024)
A Stacked Meta-Model Framework for Diabetes Prediction: From Effective Feature Engineering to Meta-Learning (2024)
Automated heart disease prediction using improved explainable learning-based technique (2024)
A Contrast Based Feature Selection Algorithm for High-dimensional Data set in Machine Learning (2024)
SFE: A Simple, Fast, and Efficient Feature Selection Algorithm for High-Dimensional Data (2023)
Random feature selection using random subspace logistic regression (2023)
Kernel Optimization for Reducing Core Vector Machine Classification Error (2023)
Effect of Principal Component Analysis on Genetic Algorithm Feature Selection (2023)
Memory-Efficient Continual Learning Object Segmentation for Long Videos (2023)
Feature space reduction method for ultrahigh-dimensional, multiclass data: random forest-based multiround screening (RFMS) (2023)
An Adaptive Streaming Feature Selection Technique for Classifying Non-Stationary Data Streams (2023)
HBDFA: An intelligent nature-inspired computing with high-dimensional data analytics (2023)
Late acceptance hill climbing aided chaotic harmony search for feature selection: An empirical analysis on medical data (2023)
Wide and deep learning based approaches for classification of Alzheimer’s disease using genome-wide association studies (2023)
Feature selection using relative dependency complement mutual information in fitting fuzzy rough set model (2023)
Momentary ride comfort evaluation of high-speed trains based on feature selection and gated recurrent unit network (2023)
GB-AFS: graph-based automatic feature selection for multi-class classification via Mean Simplified Silhouette (2023)
An Enhanced Hunger Games Search Optimization with Application to Constrained Engineering Optimization Problems (2023)
On the performance analysis of rainfall prediction using mutual information with artificial neural network (2023)
Advances in nature-inspired metaheuristic optimization for feature selection problem: A comprehensive survey (2023)
Support vector machine based disease classification model employing hasten eagle Cuculidae search optimization (2022)
A binary individual search strategy-based bi-objective evolutionary algorithm for high-dimensional feature selection (2022)
A deep learning-based diagnostic tool for identifying various diseases via facial images (2022)
A Comprehensive Survey on the Process, Methods, Evaluation, and Challenges of Feature Selection (2022)
Feature selection for online streaming high-dimensional data: A state-of-the-art review (2022)
An Improved Auto Categorical PSO with ML for Heart Disease Prediction (2022)
Weak Monotonicity With Trend Analysis for Unsupervised Feature Evaluation (2022)
Multivariable fuzzy rule-based models and their granular generalization: A visual interpretable framework (2022)
A Surrogate-Assisted Evolutionary Feature Selection Algorithm With Parallel Random Grouping for High-Dimensional Classification (2022)
Explainable Artificial Intelligence in Data Science (2022)
Intraday Trading Strategy based on Gated Recurrent Unit and Convolutional Neural Network: Forecasting Daily Price Direction (2022)
HAIVAN: a Holistic ML Analytics Infrastructure for a Variety of Radio Access Networks (2022)
A comprehensive survey on recent metaheuristics for feature selection (2022)
Application of AI in Healthcare (2022)
ECG-BiCoNet: An ECG-based pipeline for COVID-19 diagnosis using Bi-Layers of deep features integration (2022)
Machine learning method for heart disease prediction using convolutional neural network (2022)
Fast Extraction Algorithm of Hadoop Message based on Artificial Intelligence Architecture (2022)
A dynamic feature selection and intelligent model serving for hybrid batch-stream processing (2022)
Knowledge representation for explainable artificial intelligence (2022)
Simulation of Distributed Big Data Intelligent Fusion Algorithm Based on Machine Learning (2022)
An Interpretable Deep Embedding Model for Few and Imbalanced Biomedical Data (2022)
Feature Selection and Classification using a Positive Learning Approach Focused on Graph and Neural Network (2022)
Short-term probabilistic building load forecasting based on feature integrated artificial intelligent approach (2022)
E-Healthcare System for the Diagnosis of Heart Disease using MSSO-ANFIS (2021)
Hybrid feature selection approach to identify optimal features of profile metadata to detect social bots in Twitter (2021)
Q-Learning with Fisher Score for Feature Selection of Large-Scale Data Sets (2021)
Dimensions of Cybersecurity Risk Management (2021)
Bio-Inspired Data Mining for Optimizing GPCR Function Identification (2021)
Binary biogeography-based optimization based SVM-RFE for feature selection (2021)
Can empirical mode decomposition improve heartbeat detection in fetal phonocardiography signals? (2021)
HEART DISEASE IDENTIFICATION METHOD BY USING MACHINE LEARNING CLASSIFICATION (2021)
A feature selection method via analysis of relevance, redundancy, and interaction (2021)
Multi-layer linear embedding with feature subset selection (2021)
DSSAE-BBOA: deep learning-based weather big data analysis and visualization (2021)
Accelerating Analytics Using Improved Binary Particle Swarm Optimization for Discrete Feature Selection (2021)
Feature Subset Selection Based on Variable Precision Neighborhood Rough Sets (2021)
International Journal of Electrical and Computer Engineering (IJECE) (2021)
CoMB-Deep: Composite Deep Learning-Based Pipeline for Classifying Childhood Medulloblastoma and Its Classes (2021)
Planning and Design of Urban Landscape Architecture under the Background of Big Data (2021)
ZU Scholars ZU Scholars (2021)
Chaotic diffusion‐limited aggregation enhanced grey wolf optimizer: Insights, analysis, binarization, and feature selection (2021)
A concise method for feature selection via normalized frequencies (2021)
Dispersed foraging slime mould algorithm: Continuous and binary variants for global optimization and wrapper-based feature selection (2021)
Ensembling Classical Machine Learning and Deep Learning Approaches for Morbidity Identification From Clinical Notes (2021)
Dynamic Distributed and Parallel Machine Learning algorithms for big data mining processing (2021)
A fog computing data reduce level to enhance the cloud of things performance (2021, influential citation)
Benchmarking feature selection methods for compressing image information in high-content screening (2021)
Feature selection based on a crow search algorithm for big data classification (2021)
Feature selection via minimizing global redundancy for imbalanced data (2021)
Effective feature representation using symbolic approach for classification and clustering of big data (2021)
A hybrid multi-class imbalanced learning method for predicting the quality level of diesel engines (2021)
Investigating the use of feature selection techniques for gender prediction systems based on keystroke dynamics (2021)
Machine learning integrated emotions detection on lockdowns in India using advanced web scraping (2021)
OPTIMIZED MACHINE LEARNING TAXONOMY TECHNIQUES FOR CARDIO DISEASE PREDICTION (2021)
An exploratory analysis of data noisy scenarios in a Pareto-front based dynamic feature selection method (2021)
A survey on feature selection methods for mixed data (2021)