Robust Feature Selection by Mutual Information Distributions

Marco Zaffalon, Marcus Hutter

Published in 2002 at the Conference on Uncertainty in Artificial Intelligence (UAI)

ABSTRACT

Mutual information is widely used in artificial intelligence, in a descriptive way, to measure the stochastic dependence of discrete random variables. In order to address questions such as the reliability of the empirical value, one must consider sample-to-population inferential approaches. This paper deals with the distribution of mutual information, as obtained in a Bayesian framework by a second-order Dirichlet prior distribution. The exact analytical expression for the mean and an analytical approximation of the variance are reported. Asymptotic approximations of the distribution are proposed. The results are applied to the problem of selecting features for incremental learning and classification of the naive Bayes classifier. A fast, newly defined method is shown to outperform the traditional approach based on empirical mutual information on a number of real data sets. Finally, a theoretical development is reported that allows one to efficiently extend the above methods to incomplete samples in an easy and effective way.
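The abstract's central idea — treating mutual information as a random variable with a posterior distribution induced by a Dirichlet prior over the joint probabilities, rather than as a single empirical number — can be illustrated with a Monte Carlo sketch. The paper itself derives the mean and variance analytically; the sampling approach below is only an illustrative stand-in for those closed-form results, and the function names and the toy contingency table are invented for this example.

```python
import numpy as np

def mi_from_joint(p):
    # Mutual information I(X;Y) in nats, computed from a joint
    # probability table p over two discrete variables.
    px = p.sum(axis=1, keepdims=True)
    py = p.sum(axis=0, keepdims=True)
    mask = p > 0
    return float(np.sum(p[mask] * np.log(p[mask] / (px * py)[mask])))

def mi_posterior_samples(counts, n_samples=5000, alpha=1.0, seed=None):
    # Monte Carlo approximation of the posterior distribution of MI:
    # draw joint tables from the Dirichlet posterior (observed counts
    # plus a symmetric prior of `alpha` pseudo-counts per cell) and
    # evaluate MI on each draw.
    rng = np.random.default_rng(seed)
    shape = counts.shape
    draws = rng.dirichlet(counts.ravel() + alpha, size=n_samples)
    return np.array([mi_from_joint(d.reshape(shape)) for d in draws])

# Toy contingency table for two binary variables (hypothetical data).
counts = np.array([[30, 5], [4, 25]])
samples = mi_posterior_samples(counts, seed=0)
# Posterior mean and standard deviation of the mutual information;
# the spread indicates how reliable the empirical MI value is.
print(samples.mean(), samples.std())
```

A robust feature-selection rule in this spirit would keep a feature only when most of the posterior mass of its MI with the class lies above a threshold, rather than comparing the single empirical MI value to that threshold.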

PUBLICATION RECORD

  • Publication year

    2002

  • Venue

    Conference on Uncertainty in Artificial Intelligence

  • Publication date

    2002-06-03

  • Fields of study

    Mathematics, Computer Science


  • Source metadata

    Semantic Scholar


REFERENCES

34 references

CITED BY

126 citing papers